INFLUENCE OF THE INTERVIEWER ON THE ACCURACY 
OF SURVEY RESULTS* 


Robert H. Hanson 
Bureau of the Census 
AND 
Eli S. Marks 
National Analysis, Inc. 
AND 
National Opinion Research Center 

This paper reports results of a large scale study of the effect of inter- 
viewers on survey results. Where significant effects of the interviewer 
upon the results are found, the important factors appear to be (1) 
interviewer “resistance” to a given question—i.e., a tendency to omit 
or alter the question and/or to assume the answer; (2) relatively high 
ambiguity, “subjectivity,” or complexity in the concept or wording of 
the inquiry; (3) the degree to which additional questioning (“probing”) 
tends to alter initial respondent replies. 

The study also investigated the relationship of interviewer perform- 
ance to interviewer characteristics as given by (1) scores on a number 
of tests; (2) by such personal characteristics as age, sex, occupation, 
education; and (3) by attitudes toward, and expectations of respondent 
reactions. At least one measure of poor quality of enumeration (number 
of improperly omitted Census entries) seems to show substantial cor- 
relation with age, with some test scores, and with expectations about 
respondent cooperation. 


1. INTRODUCTION 


THE DATA on which this report is based are from a study of interviewer variation
conducted by the U. S. Bureau of the Census as a part of the 1950 
Censuses of Population, Housing, and Agriculture. While the Enumerator 
Variance Study actually covered all three Censuses, the part that has been 
tabulated deals with the Population Census only. The concept behind the 
Enumerator Variance Study of the Population Census of 1950 is not a new one, 
nor is the technique employed essentially novel. It is basically an analysis of 
variance method in which variation in results obtained by different interviewers 
is compared with the variation between respondents within the same interviewer. 
One feature which gives trouble in studies of this design is the difficulty in 
randomizing the respondents as between interviewers. That is, in applying an 
analysis of variance, the null hypothesis is, of course, that the different interviewers
would obtain results having the same expected value. In general, except 
in laboratory situations, we cannot study the situation in which the interviewers 
should obtain results which are identical and not subject to sampling variance. 
The problem is that interviewing is a “destructive process”: it does not, of 
course, destroy the respondent but does destroy his pristine innocence and 
naivete. This, at least, is the claim that has been made by those who object to 
obtaining a measure of interviewer variance by reinterviewing the same respondent,
and to some extent we tend to agree with this position. There is certainly
some merit to the position that respondents will tend to remember 
responses given on a first interview and to repeat these responses at a reinterview.

* This study was sponsored jointly by the United States Bureau of the Census and The National Opinion 
Research Center and was supported, in part, by a grant of funds under the Behavioral Sciences Program of The 
Ford Foundation. 

If we deal with a design in which a respondent is interviewed only once, we are 
forced to consider other methods of securing identical expected values for two 
or more interviewers (identical, that is, in the absence of real interviewer effect 
upon the results of the interview). Apart from interviewer effects, the results ob- 
tained by two different interviewers will differ because of differences in the 
respondents. This problem can be handled statistically by randomizing respondents
among interviewers. While this statistical solution presents no great 
theoretical problem, it does present very substantial difficulties in practical 
application. 

If all that is wanted is a comparison of two interviewers with each other, the 
problem is simple enough, provided matters can be arranged so that the inter- 
viewers can be given random assignments from the same population. But this 
procedure gives information only about these particular interviewers and not 
anything about interviewers in general or about any general class of inter- 
viewers. When we are dealing only with particular interviewers, we need be 
concerned statistically only with the number of respondents and if this number 
is sufficiently large (i.e., if we can estimate within interviewer variance with 
high accuracy), we can compare our two interviewers adequately. When, how- 
ever, we are dealing with a larger population of interviewers and the particular 
interviewers we are studying are only a sample from that larger population, we 
have a problem in which both interviewers and respondents are sampled and 
we must estimate adequately between interviewer variance as well as within 
interviewer variance. 

Excellently designed studies of interviewer variation have been conducted 
by a great many research workers—notably Mahalanobis [7], Hyman and 
associates [2], and Hochstim and Stock [5]. All of the workers in question went 
to considerable effort to randomize the assignments of the interviewers studied 
and all obtained valid measures of the effect of these particular interviewers on 
study results. With a relatively small number of degrees of freedom, important 
interviewer contributions may not be detected, and there is still the question of 
whether the absence or presence of differences among interviewers is a feature 
of interviewing in general or is an idiosyncrasy of the particular interviewers 
studied. This is the old problem of deciding when a negative result indicates a lack 
of relationship and when it indicates only a large sampling variance. 

To get away from the limitations on the number of interviewers included in 
a study of interviewer variation, it is almost essential to go to a census or very 
large-scale survey. It is true that small studies may frequently use as many as 
50 to 200 interviewers. However, studies which use this number of interviewers 
are usually national in scope with possibly two interviewers in 9 or 10 large 
metropolitan areas (perhaps also 3 in Chicago and 9 or 10 in New York) and one 
in other primary sampling units. In practice it is not feasible to randomize the 
assignments of interviewers to PSU’s where you have only one interviewer. 
In the areas where there is more than one interviewer, assignments can be 
randomized within the area but one degree of freedom is lost for each area. 
Thus, we are limited, in most studies, to something like a maximum of 40 
interviewers with only about 20 to 25 degrees of freedom for the differences be- 
tween interviewers. In actual fact, even 25 is more than the number of degrees 
of freedom used in the usual study of interviewer variation. Mahalanobis’ [7] 
important pioneering studies of interviewer variance used fewer than 10 interviewers.
In the Enumerator Variance Study of the 1950 Census of Population, 
it was possible to randomize assignments for some 1000 interviewers within 134 
assignment areas. 
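
The degrees-of-freedom counts in this paragraph can be checked with a short calculation. The sketch below is illustrative only: the function name `between_interviewer_df` and the hypothetical national design (10 metropolitan areas with two interviewers each, 3 interviewers in Chicago, 10 in New York, and 60 single-interviewer PSU's) are assumptions chosen to resemble the round figures in the text, not data from any actual survey.

```python
def between_interviewer_df(interviewers_per_area):
    """Degrees of freedom for between-interviewer comparisons when
    assignments are randomized only within areas.

    One degree of freedom is lost per area, and areas with a single
    interviewer contribute nothing.
    """
    usable = [n for n in interviewers_per_area if n >= 2]
    return sum(usable) - len(usable)


# Hypothetical national survey (assumed figures, see lead-in).
national = [2] * 10 + [3, 10] + [1] * 60
national_df = between_interviewer_df(national)

# Round figures quoted for the Enumerator Variance Study, assuming every
# assignment area had at least two interviewers.
evs_df = 1000 - 134

print(national_df, evs_df)
```

Under these assumed figures the national design leaves about 20 degrees of freedom for differences between interviewers, while the round numbers quoted for the Enumerator Variance Study leave several hundred.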

Using a Census for the study of interviewer variation does involve special 
problems in randomizing interviewers’ assignments. The requirement that a 
Census obtain complete coverage of the population—i.e., a thorough canvass 
of all areas in the United States—means that individual respondents cannot 
be assigned to different interviewers independently. Instead we must assign 
groups of respondents—enumeration districts—as a unit. For the areas included 
in the EVS, enumeration districts were defined to contain approximately 550 
persons and the Census Enumerator Variance Study randomized enumeration 
districts among interviewers and not individual respondents. This reduces the 
number of degrees of freedom within interviewers but probably not to an ex- 
tent where it is of any great consequence. 


2. THE MODEL 


Consider a population of enumeration districts (ED’s) all of which are to be 
enumerated in a Census. The population of ED’s is presumed to be within a 
limited geographic area such that a particular Census interviewer could rea- 
sonably be assigned to enumerate any one of the ED’s. In addition, we have 
available a sample of interviewers presumed to have been drawn at random from 
an infinite population of available persons in or near the area and having the 
qualifications of a Census interviewer. By a random process, each sample inter- 
viewer is assigned one ED and on its completion is assigned a second and, if 
necessary, a third in order to exhaust the population of ED’s to be covered. This 
is essentially the same model as formulated by Hansen-Hurwitz-Marks-Mauldin [3]. 

With such a model, it is possible to write estimators for the variance among 
ED’s within interviewer assignments and the variance between interviewer 
assignments such that the values of these two variance estimates are independent 
and their expected values are identical under the null hypothesis, the null 
hypothesis being that the expected value for any ED is the same regardless of 
the interviewer. With this design any ED is a sample from which the population 
total may be estimated. If the distribution of these estimated totals is approximately
normal, the sums of squares will be distributed approximately as chi-square,
leading to the use of the F test for measuring significance of interviewer 
contributions. 
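
As an illustration of the comparison described above, the following sketch computes between-interviewer and within-interviewer sums of squares for ED-level values, pools them over assignment areas, and forms the ratio of mean squares. It is a textbook one-way analysis of variance, not the estimators actually used in the study; the function names `sums_of_squares` and `f_ratio` and the toy proportions are hypothetical.

```python
def sums_of_squares(groups):
    """groups: list of lists, one inner list per interviewer, holding the
    ED-level values that interviewer obtained within one assignment area.
    Returns (SS between, df between, SS within, df within)."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        mean_g = sum(g) / len(g)
        ss_between += len(g) * (mean_g - grand_mean) ** 2
        ss_within += sum((y - mean_g) ** 2 for y in g)
    return ss_between, len(groups) - 1, ss_within, n_total - len(groups)


def f_ratio(areas):
    """Pool sums of squares and degrees of freedom over assignment areas
    and return (F, between df, within df)."""
    ssb = ssw = 0.0
    dfb = dfw = 0
    for groups in areas:
        b, db, w, dw = sums_of_squares(groups)
        ssb += b
        dfb += db
        ssw += w
        dfw += dw
    return (ssb / dfb) / (ssw / dfw), dfb, dfw


# Toy data: one assignment area, three interviewers, two ED's each.
area = [[0.10, 0.12], [0.20, 0.22], [0.15, 0.17]]
F, df_b, df_w = f_ratio([area])
print(round(F, 2), df_b, df_w)
```

In the toy data the interviewers differ far more from one another than the ED's of any one interviewer differ among themselves, so the ratio is large (about 25 on 2 and 3 degrees of freedom). Pooling over areas in this way gives a between-interviewer count equal to the number of interviewers less the number of areas.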

The F-distribution is reasonably applicable to results for a single assignment 
area, but when the model is extended to include results for a number of assign- 
ment areas (strata) as was done in our experiment, an F-distribution is ap- 
propriate only if strata estimates are such that they have equal variances. The 
stratum estimates actually used may have variances which are far from equal. 
The work of Box [1] indicates that in these circumstances the computed F-ra- 
tios will have the F-distribution but with a reduced number of degrees of freedom. 
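
The loss of degrees of freedom under unequal variances can be illustrated with a Satterthwaite-type approximation, in which a weighted sum of independent chi-square variables is matched to a single scaled chi-square. This is offered only as a sketch of the general idea; that it resembles the calculation underlying [1] is an assumption, and the function name `effective_df` and the example variances are hypothetical.

```python
def effective_df(variances, dfs):
    """Approximate degrees of freedom for a pooled sum of squares whose
    strata have the given variances and degrees of freedom.

    Uses (sum v_i * d_i)^2 / sum(v_i^2 * d_i), which equals sum(d_i)
    when all variances are equal and is smaller otherwise.
    """
    numerator = sum(v * d for v, d in zip(variances, dfs)) ** 2
    denominator = sum(v * v * d for v, d in zip(variances, dfs))
    return numerator / denominator


equal = effective_df([1.0, 1.0, 1.0], [4, 4, 4])
unequal = effective_df([1.0, 1.0, 10.0], [4, 4, 4])
print(equal, round(unequal, 2))
```

With equal variances the three strata retain all 12 degrees of freedom; when one stratum has ten times the variance of the others, the effective count falls toward the 4 degrees of freedom of that stratum alone.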


3. THE EXPERIMENT 


The experiment was carried out in 21 counties in Ohio and Michigan having 
an aggregate population of about 1½ million people. A national sample would 
have been preferable from the standpoint of design and general applicability of 
results obtained. However, the problem of insuring execution of such a design 
is not a simple one and the advantages of direct supervision by the central 
staff of the Census Bureau made it desirable to restrict the geographic spread. 

The population of the survey area was about 55 per cent urban with about 
5 per cent of the population nonwhite. The largest cities in the area were 
Dayton, Hamilton, Zanesville, Ohio, and Saginaw and St. Joseph, Michigan. 
The ED’s in these areas were made about one-third smaller in size than the 
regular Census ED so that assignment of more than one ED to an interviewer 
would be feasible. The 200 interviewer assignment areas were defined so that 
each area would be as large as possible within the limitations imposed by ad- 
ministrative criteria of travel and supervision and availability of interviewers. 
Within the assignment areas, ED’s were assigned at random to the interview- 
ers. The interviewers were required to complete their assigned ED’s in the 
(random) order in which they were assigned. 

Interviewers in the experiment were selected under the same qualification 
standards as in the regular Census. Qualification standards (not always rigor- 
ously enforced) called for a high school education, passing a qualifying examina- 
tion referred to as the Enumerator Selection Aid Test, and physical ability to 
perform an interviewer’s work. Political recommendation or endorsement was 
typically required for interviewers except where a sufficient number of qualified 
interviewers could not be obtained in this manner. The training consisted of 16 
hours spread over four days if the assignment consisted only of urban ED’s or 
24 hours over five days if the assignment contained a rural ED. For the study of 
interviewer characteristics, the experimental interviewers completed additional 
tests and questionnaires not given to the regular Census interviewers. 

The experiment initially consisted of about 1400 interviewers and 2500 ED’s 
in 200 assignment areas. It was necessary, however, to eliminate some of these 
areas, usually for one or more of the following reasons: (a) In some cases, the 
random assignment of an ED to an interviewer was not followed, either because 
a nonrandom assignment was made or because the interviewer failed to 
complete an assigned ED. If an interviewer completed his first ED but refused 
a second, he was retained in the experiment, but his second ED was then 
assigned to another interviewer in the same area; (b) All ED’s containing large 
hotels, dormitories or institutions were left out. These factors reduced the numbers
in the experiment to 984 interviewers covering 1778 ED’s in 134 assignment 
areas. It is on this universe that our analysis of interviewer performance is 
based. For computations of the F-ratios used, the universe was further reduced, 
by eliminating all interviewers not completing two or more ED’s, to 705 interviewers
doing 1489 ED’s in 125 areas. 


4. GENERAL LIMITATIONS OF RESULTS 


There are certain factors of our experiment that must be considered in gen- 
eralizing from these results to statements about variance contributions to be 
expected from interviewers in other surveys. 

(a) A frequently raised objection to the use of Census data for the study of 
interviewer variability is that the Census data are factual and the data to be 
collected are essentially simple so that one would expect very little influence 
of the interviewers on the results. In practice, we would challenge these state- 
ments. With respect to some information—for example, age—the statements 
may be approximately correct but for most kinds of information it is our feeling 
that the lines between fact and opinion and between subjective and objective 
are far more hazy than is commonly recognized. If you read the Census manuals, 
you will discover that while a person’s age may be a matter of fact, whether he 
was looking for work last week is largely a matter of opinion. In any event we 
found that the interviewer contribution to the variance of the Census for the 
various questions on the Census schedule ranged from the trivial to the highly 
significant. 

(b) The method of selecting interviewers has already been described. In our 
judgment, the political aspect of selection did not significantly affect the char- 
acter of the interviewers selected although occasionally it did lead to acceptance 
of persons not meeting the qualification requirements. 

(c) A more serious limitation to the general applicability of the results is the 
fact that the Census was taken with a temporary staff, most of this staff having 
no previous experience or training in interviewing. We can be fairly confident, 
however, that the interviewer effects for this population of interviewers would 
be greater than for the usual population of more experienced and better trained 
interviewers. This applies to interviewer variance (differences among interviewers)
and not interviewer bias (the difference of the average of all interviewers
from the “correct value”). Experienced or trained interviewers might be 
subject to just as much (if not more) bias as the interviewers used in a Census 
but the effect of their training and experience would be to make them more 
homogeneous in their reactions to the interviewing situation and therefore to 
reduce the variance between interviewers. 

(d) Most of the interviewer’s pay was based on a “piece-rate” (averaging 
about 10 cents per name entered on the Census schedule). Opponents of piece-rate
pay maintain that it puts the interviewer’s emphasis on quantity with a consequent
erosion of quality. If the loss of quality due to the piece-rate system 
is appreciable, there should be evidence of “padding” the list of names on the 
Census schedule. As far as can be determined, improper additions to the Census 
were less than 0.1 per cent of the population of the United States as a 
whole [4]. 

(e) The ED data used in the analysis of this experiment have been edited by 
the regular processing procedures used for the Census. The effect of this is to 
eliminate some of the more obviously inconsistent entries made by the inter- 
viewer. This is especially true for items relating to sex, age, and relationship 
where the processing instructions were to insert missing entries and as nearly as 
possible to correct those entries which were obviously inconsistent. 

(f) It should be recalled that ED’s containing large special dwelling places 
(hotels, institutions, etc.) have been deleted from the analysis. These ED’s are 
usually of above-average difficulty and might be expected to contribute sub- 
stantially to interviewer variance. 

While the effect of factors (e) and (f) is to understate the amount of variance 
due to the interviewer, the other factors would, in general, operate in the direc- 
tion of an overestimate and probably would more than balance any underesti- 
mate due to (e) or (f). 


5. EXISTENCE OF SIGNIFICANT INTERVIEWER VARIANCE CONTRIBUTIONS 


Interviewer variances have been analyzed for the 104 ED characteristics 
shown in Table 643. In addition to the usual Census statistics, certain specially 
tabulated items related to interviewer performance are shown. 

Because the within-assignment area variances differed from one assignment 
area to another, the computed F-ratios have the F-distribution with an un- 
known number of degrees of freedom which is, however, at least 125 and not 
more than 580. Since the F-ratios are based on interviewers who did two or 
more ED’s and since relatively few (79) interviewers did more than two ED’s, 
the number of degrees of freedom is nearly the same for the within and between 
variances. Taking the most conservative position (i.e., using 125 degrees of 
freedom for both variances), F-ratios of 1.35 or more are significant at the .05 
level and F-ratios of 1.52 or more significant at the .01 level. 
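
The bounds and critical values quoted above lend themselves to a small worked check. In the sketch below, the upper bound of 580 is reproduced as the number of interviewers less the number of assignment areas (an inference from the counts given in Section 3, not a statement made in the text), and the critical values 1.35 and 1.52 are taken directly from the paragraph above. The function name `significance_level` is hypothetical; the three F-ratios are those shown for Items 1, 2, and 53 in Table 643.

```python
INTERVIEWERS = 705   # interviewers completing two or more ED's
AREAS = 125          # assignment areas used for the F-ratios

# Assumed reading of the upper bound: one degree of freedom lost per area.
upper_df = INTERVIEWERS - AREAS

# Conservative critical values from the text (125 and 125 degrees of freedom).
F_CRIT_05 = 1.35
F_CRIT_01 = 1.52


def significance_level(f):
    """Classify an F-ratio against the conservative critical values."""
    if f >= F_CRIT_01:
        return ".01"
    if f >= F_CRIT_05:
        return ".05"
    return "not significant"


for item, f in [(1, 1.10), (2, 1.42), (53, 3.79)]:
    print(item, f, significance_level(f))
```

Because the critical values are the conservative ones, an item classed as significant here would remain so under any larger number of degrees of freedom within the stated range.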

In evaluating results from experiments such as this, one should remember 
the difference between the concepts of “significance” and “importance” of the 
interviewer variance contribution. Our experiment can tell us that there are 
components of the total variance of a census-type survey that are attributable 
to the interviewer. Furthermore, since we have assigned the ED’s to inter- 
viewers at random we can perform a valid statistical test. On the other hand, 
because we have a large experiment, even if the size of the interviewer’s con- 
tribution to the variance is significant, it may or may not be big enough to be 
considered “important” relative to other sources of variation. The F-ratios 
presented here test the “significance” of interviewer variation, not its “im- 
portance.” 

In another paper being prepared as a part of this same study, estimates of the 
interviewer variance for different-sized interviewer assignments under condi- 
tions of a Census are given. It should be emphasized, however, that the com- 
ponent of variance due to interviewers in a survey is dependent not only upon 
the variance levels (the quantities which primarily affect the F-test) but also 
upon the sample sizes and design. With a fixed interviewer assignment the 
effect of the interviewer variance on sample survey results will decrease as the 
number of interviewers involved increases (of course since this implies an in- 
crease in sample size, the total variance will also decrease); with a fixed num- 
ber of interviewers, the effect of interviewer variance on survey results will 
decrease as the size of interviewer assignment decreases even though this means 
a decrease in sample size and a consequent increase in total variance. 
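
The two statements above can be illustrated under a simple correlated-response model in which k interviewers each complete m interviews, the variance per interview is sigma2, and responses obtained by the same interviewer have intraclass correlation rho. The model, the function names, and the parameter values are assumptions introduced for illustration; they are not the variance estimates of the study.

```python
def total_variance(k, m, sigma2=1.0, rho=0.02):
    """Variance of a sample mean from k interviewers with m interviews
    each, under the assumed model: sigma2 / (k*m) * (1 + rho*(m - 1))."""
    return sigma2 / (k * m) * (1 + rho * (m - 1))


def interviewer_component(k, m, sigma2=1.0, rho=0.02):
    """Part of the total variance attributable to the correlation of
    responses within interviewer assignments."""
    return sigma2 * rho * (m - 1) / (k * m)


# Fixed assignment size (m = 100), more interviewers: both quantities fall.
print(interviewer_component(10, 100), total_variance(10, 100))
print(interviewer_component(20, 100), total_variance(20, 100))

# Fixed number of interviewers (k = 10), smaller assignments: the
# interviewer component falls while the total variance rises.
print(interviewer_component(10, 50), total_variance(10, 50))
```

Under this model the interviewer component depends on the assignment size only through the factor (m - 1)/m, so reducing the assignment size lowers it only slightly, whereas the loss of sample size raises the total variance appreciably.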

We seem to have three general categories of ED characteristics for which 
the computed F indicates significance beyond the level of .01. They are: 

(1) Measures of the number of improperly omitted census entries, i.e., number
of NA’s (not answered entries) reported for a Census question. 

(2) Census questions containing one or more of the following factors—(a) in- 
terviewer “resistance” to the question, i.e., a tendency on the part of the inter- 
viewer to be hesitant about making the inquiry and possibly a tendency to 
omit or alter the question or assume the answer, or (b) a relatively high degree 
of ambiguity, subjectivity, or complexity in the concept or wording of the in- 
quiry, or (c) the degree to which additional questioning tends to alter respond- 
ent replies. 

(3) Items for which the distribution of the computed F-ratio is very likely to 
be quite different from the F-distribution. 

The NA rate is a composite of many facets of the interviewer: Is he careless? 
Does he remember that the question applies to the respondent? Does he have 
the desire to complete the question? Does he understand the instructions? We 
know from experience that these factors and several others vary among interviewers
so that significant variations are to be expected. Table 643 shows that 
all of the 11 NA (or “not answered”) items are significant at levels beyond .01 
(Items 44, 46, 51, 52, 59, 74, 78, 82, 86, 90, 93). 

As an example of the “resistance-complexity” type category, consider Items 
54 through 58, in Table 643, which involve the Census questions on school attainment.
These items were tabulated from the responses to questions 26 and 27 
of the 1950 Census of Population questionnaire shown in Fig. 642. There is 
often confusion between “highest grade of school attended” and “highest grade of 
school completed”; moreover, this question required the interviewer to code the 
response given and, in addition, a relatively complex set of instructions 
governed the type of schooling to be included. To complete this entry, then, the 
interviewer had to have the instructions well in mind, and often had to resort to 
probing questions to determine the proper response. 

Question 27 on the Census schedule, “Did he finish this grade?” is a simple 
enough concept but interviewers have considerable resistance to asking it on the 
grounds that the question of school attendance has already given the answer. 
As shown in Table 643—Items 52 and 53—the F-ratios are high for this item. 
Other examples of items in this category appearing in Table 643 are questions 
on migration (Items 47 through 49), and income items. 

Our computed F’s may not be samples from distributions that are reasonably 
approximated by the F-distribution for some items shown in Table 643. This is 
likely to be true when the proportion we are estimating is very small and 
especially so when the statistic tends to be clustered in a few ED’s. In such 
cases the distribution of assignment area totals made from the ED’s in these 
areas will be poorly approximated by the normal distribution. In the case of 
“Other races,” Item 4, we find the proportion to be small for a statistic where 
there is high clustering within ED’s. We find for this item incidentally, that 
from the 125 assignment areas, two areas contribute 53 per cent of the “be- 
tween” variance and three areas contribute 45 per cent of the “within” variance. 


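[Figure 642, reproduced from the Census schedule, is not legible in this copy. Question 26 asks for the highest grade of school that the person has attended and is entered as a code: None, 0; Kindergarten, K; codes for elementary and high school grades (illegible); C1, C2, C3, C4; and C5 (1 year or more). Question 27 asks, “Did he finish this grade?”]

Fig. 642. Questions 26 and 27, on school attainment, 1950 Decennial Census schedule. 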


Another factor affecting some of the F-ratios was a field check to correct 
biases in the within-ED sampling rate. The interviewers’ instructions were to 
list interviewed persons on succeeding lines on the population schedule, using 
every line on the schedule. The persons listed on predesignated sample lines 
then fell in a sample designed to produce supplementary information for 20 
per cent of the interviewed population. Where the crew leader observed im- 
properly omitted lines on the schedules for a completed ED, he, in effect, drew 
a new sample of persons for a portion of the ED and required the interviewer to 




TABLE 643 
F-RATIOS MEASURING PRESENCE OF SIGNIFICANT INTERVIEWER 
CONTRIBUTIONS TO THE VARIANCE OF ESTIMATES 
OF PROPORTIONS IN A CENSUS** 

Item  Characteristic (A)                            Characteristic as       Proportion  F-Ratio
No.                                                 proportion of: (B)      (A)/(B)

      Nativity
1     Native white                                  Total population        .906        1.10
2     Foreign born white                            Total population        .037        1.42
      Race
3     Negro                                         Total population        .057        1.10
4     Other races                                   Total population        .00068      1.36
      Residence
5     Farm population                               Total population        .137        1.34
6     Nonfarm males                                 All males               .854        1.35
7       Under 5 years                               Nonfarm males           .120        1.19
8       65 and older                                Nonfarm males           .081        .978
9     Nonfarm females                               All females             .872        1.32
10      Under 5 years                               Nonfarm females         .110        1.21
11      65 and older                                Nonfarm females         .004        1.15
      Age
12    Population under 1 year*                      Total population        .023        1.02
13    Population 1 year and older*                  Total population        .973        .954
14    Population under 14 years                     Total population        .260        1.31
15    Population 25 and older*                      Total population        .583        1.24
16    Population 55 and older*                      Total population        .176        1.13
      Sex and Age
17    Males                                         Total population        .492        1.05
18      65 and older                                All males               .086        1.02
19      55 and older                                All males               .172        1.07
20      45 and older                                All males               .282        1.18
21      35 and older                                All males               .422        1.24
22      25 and older                                All males               .582        1.33
23      21 and older                                All males               .641        1.27
24      15 and older                                All males               .717        1.29
25      14 and older*                               Pop. 14 and older*      .484        1.17
26      5 and older                                 All males               .884        1.20
27      Under 5 years                               All males               .116        1.20
28    Females                                       Total population        .508        1.05
29      65 and older                                All females             .094        1.13
30      55 and older                                All females             .181        1.16
31      45 and older                                All females             .288        1.24
32      35 and older                                All females             .424        1.28
33      25 and older                                All females             .589        1.38
34      21 and older                                All females             .653        1.31
35      15 and older                                All females             .734        1.29
36      14 and older*                               Pop. 14 and older*      .516        1.17
37      5 and older                                 All females             .891        1.22
38      Under 5 years                               All females             .109        1.22
      Relationship
39    Heads 14 and older                            Total population        .295        1.15
40    Heads not in 20% sample, nonwhite or farm     All heads               .138        1.26
41    Heads not in 20% sample, white and nonfarm    All heads               .666        1.30
42    Families and unrelated individuals*           Total population        .326        1.17

* Based on 20 per cent sample of persons within ED’s. 
** Based on 705 enumerators, 1489 ED’s in 125 assignment areas in 21 counties in Ohio and Michigan, 1950 
Census. 
TABLE 643 (continued) 

Item  Characteristic (A)                            Characteristic as       Proportion  F-Ratio
No.                                                 proportion of: (B)      (A)/(B)

      Marital Status
43    Females, married, spouse present*             Females 14 and older*   (illegible) (illegible)
      Veteran Status, Males 14 and Older
44    World War II veteran status NA*               Males 14 and older*     (illegible) (illegible)
45    World War II veterans*                        Males 14 and older*     .232        1.32
      Migration, Population 1 Year and Older
46    1949 residence NA*                            Total population        .0091       2.47
47    Different county or abroad 1949*              Total population        .047        1.49
48    Different house, same county 1949*            Total population        .114        1.73
49    Same house as 1949*                           Total population        .803        1.52
      School Attainment, Pop. 25 and Older
50    Reporting highest grade attended*             Total population        .570        1.20
51    Highest grade attended NA*                    Pop. 25 and older*      .017        1.62
52    Highest grade completed NA*                   Pop. 25 and older*      .034        2.99
53    Highest grade not completed*                  Pop. 25 and older*      .162        3.79
54    Attended grade 13 or higher*                  Pop. 25 and older,      .119        1.53
                                                      reporting highest
                                                      grade attended*
55    Attended grade 12 or higher*                  Pop. 25 and older,      .362        1.40
                                                      reporting highest
                                                      grade attended*
56    Attended grade 9 or higher*                   Pop. 25 and older,      .562        1.25
                                                      reporting highest
                                                      grade attended*
57    Attended grade 8 or higher*                   Pop. 25 and older,      .828        1.29
                                                      reporting highest
                                                      grade attended*
58    Attended grade 5 or higher*                   Pop. 25 and older,      .946        1.18
                                                      reporting highest
                                                      grade attended*
      School Attendance, Pop. Age 5-24
59    Now attending school, NA*                     Total population        (illegible) (illegible)
      In Labor Force, Pop. 14 and Older
60    Total*                                        Pop. 14 and older*      .542        (illegible)
61    Males*                                        Males 14 and older*     .815        (illegible)
62    Females*                                      Females 14 and older*   .285        (illegible)
63    Males age 14-19 attending school*             Males in labor force*   .022        (illegible)
64    Females age 14-19 attending school*           Females in labor force* .030        (illegible)
      Major Occupation Groups, Employed Persons
        Males
65    Farm laborers unpaid family workers*          Males in labor force*   .0093       (illegible)
66    Farm laborers paid workers*                   Males in labor force*   .016        (illegible)
67    Farmers, farm managers*                       Males in labor force*   .077        (illegible)
68    Craftsmen, foremen, kindred workers*          Males in labor force*   .210        (illegible)
        Females
69    Self-employed, clerical, sales,               Females in labor force* .010        (illegible)
        kindred workers*
70    (illegible)                                   (illegible)             (illegible) (illegible)
      Major Industry Group, Employed Persons
71    Manufacturing*                                (illegible)             .208        (illegible)
      Unable to Work, Both Sexes
72    Pop. 14 and older*                            Pop. 14 and older*      .042        (illegible)
73    Pop. 55 and older*                            Pop. 55 and older*      (illegible) (illegible)

* Based on 20 per cent sample of persons within ED’s. 
Note: entries shown as (illegible) could not be recovered from this copy. 
TABLE 643 (continued) 

Item  Characteristic (A)                            Characteristic as       Proportion  F-Ratio
No.                                                 proportion of: (B)      (A)/(B)

      Income
        Wage and Salary, Pop. 14 and Older
74    NA*                                           Pop. 14 and older*      .050        3.75
75    None*                                         Pop. 14 and older*      .474        .247
76    Under $2500*                                  Pop. 14 and older*      .251        1.56
77    $2500 and over*                               Pop. 14 and older*      .224        1.22
        Self-employed, Pop. 14 and Older
78    NA*                                           Pop. 14 and older*      .052        3.77
79    None*                                         Pop. 14 and older*      .866        2.47
80    Under $2500*                                  Pop. 14 and older*      .055        1.43
81    $2500 and over*                               Pop. 14 and older*      .026        1.33
        Unearned, Population 14 and Older
82    NA*                                           Pop. 14 and older*      .054        3.66
83    None*                                         Pop. 14 and older*      .784        2.66
84    Under $2500*                                  Pop. 14 and older*      .156        2.56
85    $2500 and over*                               Pop. 14 and older*      .0063       1.08
        Family Income, All Family Members
86    NA*                                           Families, unrelated     .052        2.21
                                                      individuals*
87    Under $2000*                                  Families, unrelated     .299        1.30
                                                      individuals*
88    $2000-$4999*                                  Families, unrelated     (illegible) 1.41
                                                      individuals*
89    $5000 and over*                               Families, unrelated     .167        1.53
                                                      individuals*
      Tabulated Statistics
        Pop. 14 and Older, Not Working, Not in Armed Forces
        Males
90    Schedule item 16 blank*(b)                    Males 14 and older*     .047        2.40
91    Schedule item 16 not blank*(b)                Males 14 and older*     .196        1.85
92    Not “Never Married”*                          Males 14 and older*     .132        1.19
        Females
93    Schedule item 16 blank*(b)                    Females 14 and older*   .036        3.27
94    Schedule item 16 not blank*(b)                Females 14 and older*   .714        1.66
95    Married, spouse present*                      Females, married        .814        1.14
                                                      spouse present*
        Females, 14 and Older, Unpaid Family Workers
96    Main activity “Working”*                      Females 14 and older*   .0040       2.11
97    Main activity not “Working,” “Yes” to         Females 14 and older*   .0033       2.47
        schedule item 16*(b)
        Sum of Number of NA’s for Items:
98    51, 52, 59 (Education)*                       Total population        .058        3.85
99    90, 93 (Schedule item 16)*(b)                 Total population        .035        2.86
100   44, 46, 86 (Vet. Status, Migration,           Total population        .084        2.97
        Family Income)*
101   74, 78, (Individual Income)*                  Total population        .115        3.92
102   98, 99, 100, 101 (All NA’s)*                  Total population        .291        4.12
        Difference Between:
103   Sample heads*                                 and total heads         .981(c)     1.17
104   Sample pop. under 14*                         and total pop. under 14 1.006(c)    1.06
105   Sample pop.*                                  and total population    .996(c)     1.00

* Based on 20 per cent sample of persons within ED’s. 

(b) Schedule item 16 (“Did this person do any work at all last week, not counting work around the house?”) was 
to be asked of all sample persons not reporting activity as “Working.” The asking of this question is frequently 
erroneously omitted, especially for housewives, the largest category in this group. 

(c) Ratio of sample statistic to total (sample plus nonsample). 
call back and pick up the additional sample information. This procedure
increases the homogeneity of interviewers with respect to proportion of persons
in the sample and probably accounts for the low F-ratios for items that reflect
primarily ratios of sample population to total population, i.e., Item 105 in
Table 643 and Item 13 which is practically the same as Item 105 (Item 13 is
equal to Item 105 minus Item 12).


6. INTERVIEWER CHARACTERISTICS AS RELATED TO PERFORMANCE 


One objective of the Census Enumerator Variance Study (EVS) was to study 
the effect of interviewers on the total survey variance. A second objective was 
to analyze the relation of differences in interviewer characteristics—demo- 
graphic, informational, attitudinal—to differences in results obtained. 

The procedure for the latter type of analysis was as follows: 

(a) Variables were selected on the basis of some plausible hypothesis. For 
example, it seemed reasonable to hypothesize that entries erroneously omitted 
from the schedule reflect primarily interviewer competence and the omitted 
entry (NA) rate should therefore be related to scores on the Enumerator Selec- 
tion Aid test which was used in eliminating the less competent applicants for 
jobs as interviewers. 

(b) Cross-tabulations of the selected variables were made and the relation- 
ships plotted. These results are shown in Figs. 648 through 653. 

(c) Second degree regression curves were fitted to the obtained relationships, 
using the interviewer performance measures as dependent variables and the 
interviewer characteristics (test scores, age, etc.) as independent variables. 
Using these regressions the variance due to the regression and the variance from 
regression could be estimated and used in evaluating the statistical significance 
of the observed relationships. These results appear in Table 654. 

(d) Where appropriate, auxiliary tabulations between interviewer character- 
istics were made. For example, relationships were found between NA rates and 
several interviewer characteristics and, to aid the analysis of these relation- 
ships, three of the interviewer characteristics (age, interviewer judgments on 
the seriousness of coverage errors, interviewer estimates of average family in- 
come) were cross-tabulated. 
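
Step (c) above can be sketched in a few lines. This is a minimal illustration only, assuming an unweighted second degree least squares fit and the usual split of the total sum of squares into a part "due to regression" (2 degrees of freedom) and a part "from regression" (n - 3 degrees of freedom). The function name and the data are hypothetical, and the paper does not say whether its fits were weighted.

```python
import numpy as np

def quadratic_regression_f(x, y):
    """Fit y = c0 + c1*x + c2*x**2 by least squares; return (F, df_reg, df_res)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    coeffs = np.polyfit(x, y, deg=2)        # coefficients, highest power first
    fitted = np.polyval(coeffs, x)
    ss_total = np.sum((y - y.mean()) ** 2)
    ss_resid = np.sum((y - fitted) ** 2)    # variation "from regression"
    ss_reg = ss_total - ss_resid            # variation "due to regression"
    df_reg, df_res = 2, len(x) - 3
    f_ratio = (ss_reg / df_reg) / (ss_resid / df_res)
    return f_ratio, df_reg, df_res

# Hypothetical data: test-score class midpoints and NA rates (per cent).
scores = [12, 15, 18, 21, 24, 27, 30]
na_rates = [9.1, 7.8, 6.0, 5.2, 4.1, 3.9, 3.5]

f_ratio, df_reg, df_res = quadratic_regression_f(scores, na_rates)
print(f"F = {f_ratio:.1f} on ({df_reg}, {df_res}) degrees of freedom")
```

The resulting F would be compared with tabled critical values for the stated degrees of freedom.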


7. INTERVIEWER DATA 


Each of the experimental interviewers completed three questionnaires: one 
before training started, one at the completion of training, and one at the com- 
pletion of enumeration. The questionnaires were designed to measure certain 
attributes of the interviewer, such as his aptitudes, prejudices, previous occupa- 
tional background, preconceptions of characteristics of the population he was 
to interview, expected difficulties to be encountered in enumeration and many 
other characteristics. A number of items were repeated on all three of the 
questionnaires to permit measurement of the effects of training and actual field 
experience. Interviewers also completed the Enumerator Selection Aid test 
required of all prospective interviewers as a condition of employment as well 
as standard personnel forms which give such demographic data as age, educa- 
tion, etc. 

The Enumerator Selection Aid (ESA) test comprised questions on reading 
comprehension, map reading, and ability to follow Census-type instructions. A
generous time limit of one hour was allowed to complete this test. The maxi- 
mum possible score on the test was 31 and a score of at least 10 was a condition 
of employment as an interviewer. These tests were administered and scored in 
the field. No systematic check was made on either the scoring or administering. 
Under such conditions, it is, of course, possible for inaccurate or improper scor- 
ing to occur. Some of the tests included in the questionnaire given to each ex- 
perimental interviewer before the start of training were quite similar to the 
ESA items. These questionnaires were administered and scored under much 
more carefully controlled conditions and the interviewers understood that the 
results of these tests would have no effect on their employment. Included in this 
paper are results from classifying interviewers by two examinations from this 
questionnaire: (1) a reading comprehension test, and (2) a test of ability to 
recognize the outlines of geometric figures within a more complicated design 
(used by courtesy of L. L. Thurstone). We shall refer to this test as the “Thur- 
stone Diagrams” test. The same questionnaire also contained a number of 
questions directed at the interviewers’ preconception of the characteristics of 
the population. Those for which results are shown are (3) his expectation of the 
average family income in the United States, (4) the per cent of respondents he 
expected would object to answering questions on income, and (5) the minimum 
number of persons which, if missed in the Census, would constitute a “very 
serious error.” 


8. MEASURES OF INTERVIEWER PERFORMANCE 


The results of the analysis of interviewer characteristics as related to per- 
formance are presented here in the form of charts which show “measures” of 
interviewer performance plotted against interviewer characteristics. We shall 
use the following notation to define two measures of performance: 

Let:

    X_{hb} = the total reported for statistic X for all interviewers of the bth
             class in the hth assignment area.
    X_b = \sum_h X_{hb}, the total reported by all interviewers of the bth class
             in all assignment areas.
    X_h = \sum_b X_{hb}, the total reported by interviewers of all classes in the
             hth assignment area.
    Z_{hb}, Z_b, Z_h have similar definitions for statistic Z.
    k_{hb} = the number of interviewers of the bth class in the hth assignment
             area.
    k_b = \sum_h k_{hb}, the total interviewers of the bth class in all assignment
             areas.

Then the measure

    P_b = X_b / Z_b

represents the over-all proportion of statistic X to Z produced by
interviewers of the bth class, and

    D_b = \frac{1}{\cdots} \sum \cdots

when expressed as a per cent (i.e., 100 D_b) is the average number of
percentage points by which the percentage obtained by an interviewer of
the bth class differs from the percentage obtained by all interviewers in
the same assignment areas.

Fig. 648. Measures of interviewer performance, interviewers classified by sex and age.
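
The two measures can be sketched as follows. The pooled proportion P_b follows the definition in the text directly. The expression used for D_b is an assumption: it reads the verbal definition as a k-weighted average, over assignment areas, of the difference between the class rate and the all-interviewer rate in the same area, i.e. (1/k_b) times the sum over h of k_hb (X_hb/Z_hb - X_h/Z_h). That form is not confirmed by the text. The function name and the data are hypothetical.

```python
from collections import defaultdict

def performance_measures(cells):
    """cells: one dict per (area, class) cell with keys area, cls, X, Z, k."""
    X_b, Z_b, k_b = defaultdict(float), defaultdict(float), defaultdict(float)
    X_h, Z_h = defaultdict(float), defaultdict(float)
    for c in cells:
        X_b[c["cls"]] += c["X"]
        Z_b[c["cls"]] += c["Z"]
        k_b[c["cls"]] += c["k"]
        X_h[c["area"]] += c["X"]
        Z_h[c["area"]] += c["Z"]

    # Pooled proportion for each interviewer class.
    P = {b: X_b[b] / Z_b[b] for b in X_b}

    # Assumed form of D_b: weighted average of within-area differences.
    D = defaultdict(float)
    for c in cells:
        diff = c["X"] / c["Z"] - X_h[c["area"]] / Z_h[c["area"]]
        D[c["cls"]] += c["k"] * diff / k_b[c["cls"]]
    return P, dict(D)

# Hypothetical data: within each area both classes report the same rate,
# but male interviewers work mostly in the low-rate (rural) area.
cells = [
    {"area": "urban", "cls": "male",   "X": 30, "Z": 100, "k": 1},
    {"area": "urban", "cls": "female", "X": 60, "Z": 200, "k": 2},
    {"area": "rural", "cls": "male",   "X": 40, "Z": 200, "k": 2},
    {"area": "rural", "cls": "female", "X": 20, "Z": 100, "k": 1},
]
P, D = performance_measures(cells)
print(P)  # pooled proportions differ between the classes
print(D)  # within-area differences are zero for both classes
```

In this constructed case the pooled proportions differ only because the classes worked in different kinds of areas, which is the situation the text describes for Fig. 648.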


For most of the ED characteristics covered by the EVS it is not possible to 
state that one result represents “good” performance and another result “bad” 
performance. We can state that Interviewer A’s performance differed from 
that of Interviewer B but we do not know whether the direction of difference 
was towards “better” or towards “worse.” For a few characteristics, a direction 
can reasonably be assigned. For example, for NA rates a high rate can rea- 
sonably be considered bad and a low rate good. For within ED sampling rates, 
rates near 1/5 (the expected rate) can be considered (subject to variance) as 
“good” and big deviations from 1/5 as bad. Unfortunately, sampling rates be- 
cause of the editing procedures already mentioned, tend to reflect crew leader 
and supervisory performance rather than interviewer performance. 

Further problems are introduced by the need to pool results for interviewers 
from different assignment areas in order to obtain enough cases to be able 
to detect relationships. One aspect of this problem is brought out quite clearly in 
Fig. 648. Here the solid line represents P_b, which is the ratio of females in the
labor force (X_b) to all females 14 and older (Z_b) obtained by pooling interviewers
in the “bth” sex-age class. The solid line shows the largest value of this
proportion for any class of male interviewers is less than the lowest value for any age
class of female interviewers. This appears to be a striking result until we observe
that about one-half of the male interviewers compared with only one-third of
the female interviewers enumerated in rural ED’s. The dotted lines in Fig. 648
show D_b, representing measures from which the effect of differences between
assignment areas has been removed. When we add the curve for D_b it becomes
clear that the difference is a difference between type of areas enumerated
rather than a difference between interviewers, i.e., primarily a reflection of the
higher proportion of females in the labor force in urban areas. All of the other
charts in this paper present D_b values only.


9. RELATIONS BETWEEN INTERVIEWER CHARACTERISTICS AND PERFORMANCE 


Figs. 649 through 653 show the relationships of selected ED characteristics 
with interviewer performance for interviewers classified by three types of inter- 
viewer characteristics: (a) Test scores measuring aptitudes; (b) Age, and (c) 
Interviewer expectation. 

The examples involving interviewer test scores appear in Figs. 649 through 
650a. 

Fig. 649 indicates the relationship between the interviewer’s Enumerator Se- 
lection Aid test score and the results obtained by the interviewer for three ED 
characteristics. Two of the ED characteristics—ratio of improperly omitted 
entries (NA’s) to total ED population and proportion of NA’s on the individual 
income questions—are such that lower values can reasonably be taken to 
mean good interviewer performance. There was reason to believe that the third 
ED characteristic shown in Fig. 649 (proportion of females 14 and older in the 
labor force) is also sensitive to good and bad interviewer performance. There 
appears to be a very definite correlation between the two NA rates and ESA 
score—interviewers with higher (presumably “better”) ESA scores showing a 
lower average NA rate—i.e., a “better performance” with respect to complete- 
ness of the questionnaires turned in. On the other hand, the relationship be- 
tween ESA score and proportion of females in the labor force is not as obvious 
and is, in fact, not statistically significant (see Table 654). These results are 
quite in line with reasonable expectations. It is reasonable to expect more 
complete questionnaires from the more competent interviewers (those with 
higher ESA scores) and, in fact, any variation in NA rates should be almost 
entirely a function of interviewer performance. On the other hand, while there 
may be basis for an argument that a high proportion of females in the labor 
Fig. 649. Weighted area differences by class for
interviewers classified by ESA test scores.


Fig. 650. Weighted area differences by class for interviewers
classified by reading comprehension test scores.


force represents “good” interviewer performance and a low proportion rep- 
resents “bad” performance, most of the variation in the proportion of females in 
the labor force shown in Fig. 649 is probably due to factors other than inter- 
viewer performance. 

Figs. 650 and 650a illustrate how tests with relatively narrow content as com- 
pared to the ESA test are similarly able to separate interviewers into groups 
having what appear to be recognizable differences in interviewer performance. 
The NA rate shows a clear cut relationship to both Reading Comprehension 
score (Fig. 650) and Thurstone Diagrams score (Fig. 650a), interviewers with
the higher scores having definitely lower NA rates. The relationship of these test 
scores to two other “measures” of interviewer performance is by no means as 
Fig. 650a. Weighted area differences by class for interviewers classified by scores
on geometric perception test (Thurstone diagrams).

Fig. 651. Weighted area differences by class for interviewers classified by age.


definite. There appears to be some small relationship of proportion of females in 
the labor force to both the Thurstone Diagram scores and the Reading Compre- 
hension scores. The relationship of proportion of families with incomes over 
$5000 to these two measures is more tenuous and is not statistically significant. 
To the extent that there is any relationship it tends to agree with the hypothesis 
that higher family income and higher proportions of females in the labor force
indicate better performance and therefore would correlate positively with other 
measures of interviewer competence—i.e., the test scores. 

Fig. 651 indicates that younger interviewers have a better performance as 
measured by the NA rate. Some survey supervisors, where feasible, try to hire 
interviewers between the ages of about 25 through 45. The usual reason given 
for such a policy is the lack of maturity of the very young applicants and the 
lack of alertness and physical stamina of the older ones. Fig. 651 might be cited 
as evidence justifying this policy (as shown by the higher NA rates for the 
youngest and oldest groups). It is quite likely, however, that the results shown 
in Fig. 651 are an effect of selective hiring rather than a justification for it. The 
condition that a large number of interviewers be hired for short term employ- 
ment is likely to result in a larger selection from the group of persons ordinarily 
not in the labor force than would occur for a survey furnishing regular employ- 
ment and requiring fewer interviewers. It is also reasonable to suppose that the 
more capable people will tend to have better jobs as they grow older and thus, 
in effect, be eliminated from selection as interviewers. In any event, the rela- 
tionship of performance measures to age seems to be largely a function of differ- 
ences between interviewer age groups with respect to other characteristics. 
For example, a cross tabulation of interviewer’s age with ESA test scores indi- 
cates that the 5-year age group having the highest mean ESA score is the age 
group 30 to 34 which is also the age group showing the lowest NA rate in Fig. 
651. The relationship of interviewer age to number of respondents reported as 
not completing highest grade of school attended seems to follow patterns similar 
Fig. 652. Weighted area differences by class for interviewers
classified by expectation of average family income.


to the NA rate relationship but without the difference between interviewers 
younger than 30 and those 30-33. This relationship tends to conform to the hypothesis
that the older and (apparently) less competent interviewers may resist
asking this question and check the “yes” answer to Census question 26 (Fig. 
642) without really verifying their information. 

Examples of performance for interviewers classified by expectations before 
the start of training are shown in Figs. 652, 652a, and 653. The classification of 
interviewers by expectation of average family income as shown in Fig. 652 indi- 
cates that actual reporting of family income tends to follow expectations, i.e., it 
suggests that interviewers may bias results in the direction of their expectations.
This is in line with the results obtained by Hyman [6]. The relationship is, 
Fig. 652a. Weighted area differences by class for interviewers classified by expected
per cent of respondents objecting to income questions.

Fig. 653. Weighted area differences by class for interviewers classified by number
of missed persons constituting a “very serious error.”


however, not statistically significant for either the proportion of families with 
incomes over $5000 or the proportion with incomes under $2000. The relation- 
ship between interviewer’s estimates of average family income and NA rate 
(also shown in Fig. 652) is also not statistically significant. 


Fig. 652a presents the rates for all NA’s and NA’s on personal income ques- 
tions for interviewers grouped by the per cent of respondents they expected 


FIG. 653a. QUESTION ANSWERED BY EXPERIMENTAL INTERVIEWERS
BEFORE START OF TRAINING

“How serious would you consider it to be if the following number of people were
missed in the count of the population throughout the United States?”

Number of people         A                B                C
not counted         Very serious    Mildly serious

       100
     1,000
    10,000
   100,000
 1,000,000
 5,000,000
10,000,000
20,000,000

(Only the entries in column A were used in classifying interviewers for Fig. 653.)
would object to income questions. The chart shows that these NA rates generally
tend to increase with the interviewer’s expected trouble on income questions.
The high NA rates for those expecting zero to one per cent of the
respondents to object to income questions do not seem to have an obvious
explanation; it is not a function of interviewer ability—to the extent that the 
ESA scores measure interviewer ability—since the average ESA score for these 
interviewers is about the same as the general mean. Our best guess is that this 
class may represent those interviewers with a somewhat vague picture of the 
real world. For the other expectation classes the results conform to the hy- 
pothesis that interviewers get what they expect. 

Fig. 653 gives the rates of all NA’s and of personal income NA’s for interviewers
classified by their responses to the question given in Fig. 653a. This question
was originally conceived as an attempt to measure the interviewer’s perspective 
on the problem of coverage in the Census. 

Although the correlation between the judgments on seriousness of coverage 
errors and NA rates is not statistically significant (or of borderline significance) 
it is interesting to note that the expectations are correlated with ESA scores, the 


TABLE 654

F-RATIOS SHOWING SIGNIFICANCE OF THE REGRESSION* OF
MEASURES OF INTERVIEWER PERFORMANCE ON INTERVIEWER CHARACTERISTICS

                                   Interviewer performance measure: ratio of A to B
Interviewer characteristic         (Dependent Variable)
(Independent Variable)             A                                   B                                     F-ratio

Enumerator Selection Aid Test      Females in labor force              Females 14 and older                  2.43
scores (ESA)                       Income NA's                         Total population                      16.33
                                   All NA’s                            Total population                      21.91

Reading Comprehension Test         Females in labor force              Females 14 and older
scores                             Family income $5000 and over        Families and unrelated individuals
                                   All NA’s                            Total population

Thurstone Diagrams Test            Females in labor force              Females 14 and older
scores                             Family income $5000 and over        Families and unrelated individuals
                                   All NA’s                            Total population

Age of interviewer                 Population 25 and older not         Population 25 and older
                                   completing highest grade of
                                   school attended
                                   Income NA's                         Total population
                                   All NA’s                            Total population

Expectation of average family      Family income $5000 and over        Families and unrelated individuals
income                             Family income NA                    Families and unrelated individuals

Expected per cent of respondents   Income NA’s                         Total population
objecting to income questions      All NA’s                            Total population

Number of missed persons                                               Total population
constituting a “Very Serious       All NA’s                            Total population
Error”

* Second degree least squares regression curves.
at 1%.
expectation group with the highest mean ESA score being also the group with
the lowest NA rate. We also note that the interviewers who thought that 100 to 
1000 missed persons would be a “very serious error” have high NA rates and low 
ESA scores. Here again we may be dealing with interviewers having a very poor 
concept of Census realities. 

The analysis above deals with interviewer expectations measured before the 
start of training. It may be important to compare these expectations with the 
interviewers’ responses to the same questions asked at the completion of train- 
ing and at the completion of enumeration. Such data were obtained by the 
Enumerator Variance Study and are now being analyzed. 
DEMAND FOR FARM PRODUCTS AT RETAIL AND THE
FARM LEVEL 


SOME EMPIRICAL MEASUREMENTS AND RELATED PROBLEMS* 


Rex F. Daly
Agricultural Economics Division 


Expenditures for food at the retail level are fairly responsive to 
changes in consumer income. But use of food products as they come 
from the farm is relatively unresponsive to price and income changes.
This paper, using a set of data on retail expenditures for food, the 
marketing bill, and the farm value, presents methodology and empirical 
measurements of price and income elasticities of demand, the flexibility 
of expenditures relative to income, and interrelationships among these 
elasticities. 

We know a considerable amount about the factors that affect demand for
farm products. But there is much that we do not know about the concept
and measurement of demand at the farm level and about the influence of the 
supply response on expenditures for farm products. To what extent do price 
changes influence expenditures for farm products and the quantity of farm 
products consumed? What is the nature of the demand for the services of proc- 
essing and marketing farm products? How do shifts in demand for final 
products influence returns to farmers and changes in farm output, in resource 
organization, and in farm inputs? This paper explores some of these questions 
and presents some simple analyses of the demand for food products and related 
marketing and processing services as well as the demand for products as they 
come from the farm. An attempt is made to isolate the influence of price and 
quantity variation on expenditures, to compute price and income elasticities of 
demand, and to relate the supply response to price, quantity, and income- 
expenditure elasticities. 

Domestic demand for farm products depends primarily on population growth 
and changes in income as they modify the pattern of consumption. Industrial 
activity and construction also affect the demand for forest products, cotton, 
paints, oils, and several other farm products. Many other factors, including 
custom, food fads, nutritional findings, and medical considerations, also influ- 
ence changes in consumption habits. The foreign market, too, is a major outlet 
for such commodities as cotton, grains, tobacco, and fats and oils. 


CONSUMER EXPENDITURES FOR FOOD AND OTHER FARM PRODUCTS 


Consumers do not buy farm products as such; they buy food, clothing, 
tobacco, paints, soap, alcohol, and other products made from farm commodities. 
The products, furthermore, are not bought at the farm. Many farm commodities 
are highly processed, most are packaged, and all must be assembled, packed, 


* Paper presented at the joint meetings of the American Statistical Association and the Econometric Society, 
September 12, 1957, at Atlantic City, New Jersey. Most of the ideas expressed in this paper are not new; to the 
author, at least, some of them have had a gestation period of several years. The author wishes to acknowledge the 
contribution, both of association and counsel, of many co-workers in the Agricultural Economics Division, U. S.
Department of Agriculture, particularly Frederick V. Waugh, Nathan M. Koffsky, Carroll Downey, Anthony 
Rojko, Marguerite Burk, and Karl Fox. 
shipped, and made available in urban centers. In 1957 the farmer received less
than 40 cents out of the consumer’s retail food dollar, and the share is much less 
for non-food products. In most years the farmer receives only about a third of 
the consumer dollar spent for food, clothing, tobacco, and other products of
the farm.

Population growth is a direct and major influence on demand when incomes 
and economic activity are maintained at a high rate. But a change in consumer 
spending for food and farm products is dependent to a considerable extent on 
consumer income. In recent years consumers as a whole have spent about a
fourth of their income for food—a 10 per cent increase in income has resulted 
in about a tenth larger outlays for food. This implies an elasticity of expendi- 
tures for food relative to income of about 1.0. Consumer expenditures for 
tobacco are also closely related to changes in income. In the postwar years 
(since 1945) consumer outlays for clothing seem somewhat less responsive to 
income changes than in earlier years. There is, of course, enough variation in 
these relationships to show that factors other than income affect consumer 
spending for farm products. 

Retail expenditures for food were expressed as a function of income for the 
periods 1929-41 and 1948-56 with the following results: 


    \log V = -.384 + .906 \log Y
                     (.027)
    r = .991                                                        (1)


In this simple correlation, (V) represents per capita expenditures by civilians 
for farm produced foods and (Y) per capita disposable income. The figure in 
parenthesis is the standard error of the regression coefficient. An examination of 
the income-expenditure scatter diagram for these data suggests a slightly smaller 
income elasticity for each of the two periods than for both periods combined. 
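
As a small worked illustration of what the slope in equation (1) implies: in a double-log relationship the slope is the elasticity, so a proportional change in income maps to a proportional change in expenditures through a power law. The function name is hypothetical and only the coefficient .906 is taken from the text.

```python
INCOME_ELASTICITY = 0.906  # slope of log V on log Y in equation (1)

def expenditure_response(income_change, elasticity=INCOME_ELASTICITY):
    """Proportional change in V implied by a proportional change in Y."""
    return (1.0 + income_change) ** elasticity - 1.0

# A 10 per cent rise in per capita disposable income:
response = expenditure_response(0.10)
print(f"{100 * response:.1f} per cent change in food expenditures")
```

The result is roughly 9 per cent. The base of the logarithms does not matter, since the elasticity is a ratio of logarithmic changes.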


Price, Quantity and Expenditures Relative to Income 


Per capita expenditures for food or any other commodity have a price and 
quantity component. And at the retail level, value is made up of the margin 
for processing and marketing as well as the farm value. Data recently available 
on civilian expenditures for food¹ permit an examination of some of these
relationships and their implications for price and income elasticities at the farm
level.

Measures of the quantity component of value include the “quantity” of 
marketing and processing services as well as pounds of food as they come from 
the farm. Income also has a price component and a quantity or real com- 
ponent. For purposes of the analyses to follow, per capita expenditures and 
income are designated as follows: 


    Food      V = V_p \cdot V_q
    Income    Y = Y_p \cdot Y_q


1 See Income-Food Relationships from Time Series and Cross Section Surveys, by Marguerite C. Burk, a paper 
presented at American Statistical Association meetings, September 1957. This paper describes the series used in 
this analysis. 


658 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1958 
In order to isolate the contribution of price and quantity changes to value
and to examine their relationship to income, the pertinent price and quantity
components are each expressed as a function of their expenditure and income
aggregate.²
Food

    \log V_p = a + b \log V
    \log V_p = .087 + .775 \log V
                      (.017)
    r = .995                                                        (2)

    \log V_q = a_1 + b_1 \log V
    \log V_q = 1.908 + .227 \log V
                       (.016)
    r = .950                                                        (3)

Income

    \log Y_p = a_2 + b_2 \log Y
    \log Y_p = .347 + .533 \log Y
                      (.022)
    r = .983                                                        (4)

    \log Y_q = a_3 + b_3 \log Y
    \log Y_q = 1.653 + .467 \log Y
                       (.022)
    r = .978                                                        (5)


In each of the above analyses the regression coefficients are statistically
significant by the usual measures. Since V_p \cdot V_q = V, the sum of the regression
coefficients in the price and quantity equations equals 1.0. The constant term is a
level adjustment which gives the characteristic for the log of the value product
(p × q). For the period analyzed, more than three-fourths of the variation in
retail expenditures for food was associated with price change while less than a 
fourth represented changes in quantity. However, for per capita income, around 
47 per cent of the change represented a change in real income; the remaining 
53 per cent was due to price variation. 
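
The statement that the price and quantity slopes sum to 1.0 is an algebraic property of least squares whenever log V = log V_p + log V_q holds exactly. The sketch below checks this on made-up series. The numbers are synthetic and are not the series used in the paper; only the count of 22 annual observations (1929-41 plus 1948-56) echoes the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 22  # 1929-41 and 1948-56 give 22 annual observations

# Synthetic log price and log quantity components (illustrative only).
log_vp = rng.normal(2.0, 0.15, n)
log_vq = rng.normal(2.0, 0.05, n)
log_v = log_vp + log_vq          # V = V_p * V_q

# Regress each component on the aggregate; polyfit returns [slope, intercept].
slope_p = np.polyfit(log_v, log_vp, 1)[0]
slope_q = np.polyfit(log_v, log_vq, 1)[0]

print(slope_p, slope_q, slope_p + slope_q)
```

Each slope can then be read as the share of variation in the aggregate associated with that component, which is how the text interprets .775 and .227 for food and .533 and .467 for income.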

Equation (1) above suggests that a 10 per cent increase in incomes has re- 
sulted in an average increase in per capita expenditures for food of more than 
9 per cent—an income elasticity of expenditures of .906. This is a sort of average 
elasticity for the period analyzed. As will be demonstrated, when supplies are 
large relative to demand, a low price elasticity of demand may result in food 


2 The quantity of food at retail (V_q) is the sum of (M_q) and (F_q) and the price of food at retail (V_p) is that
implied from V/V_q. The marketing and processing margin quantity (M_q) was computed by dividing the marketing
bill (M) by the index of the marketing margin for the “market basket of farm foods” (M_p). The quantity at the
farm level (F_q) was computed by dividing farm value (F) by the index of farm value in the “market basket of farm
foods” (F_p). The latter is a price index which follows closely the index of prices received for all farm products.
expenditures rising less than income or even declining as incomes rise. Although
retail expenditures will tend to be more closely related to income than farmers’
cash receipts, big changes in supplies relative to demand can modify this
relationship.³

When price influences are eliminated from both food expenditures and in- 
come, the elasticity of quantity relative to income is much smaller than for 
dollar expenditures and also is smaller than the flexibility of price relative to 
income. Results in equations (1), (3), and (5) indicate that a 10 per cent increase 
in real per capita income has resulted in an average increase of around 4.4 per
cent in the quantity of food purchased per person. This is less than half the 
response of per capita dollar expenditure for food relative to current dollar in- 
comes. 


    E_{V_q \cdot Y_q} = \frac{d \log V_q / d \log V}{d \log Y_q / d \log Y} \cdot \frac{d \log V}{d \log Y}
                      = \frac{.227}{.467} (.906) = .44                                  (6)


The flexibility of price components of food and consumer income, (E_{V_p \cdot Y_p}),
which is food price relative to the consumer price index—indicates that a 10 
per cent change in the consumer price index has usually been accompanied by an 
approximate 13 per cent change in average food prices. 
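
The arithmetic behind these two figures can be checked directly from the published slopes. The sketch below applies the same chain-rule combination to the quantity components and to the price components. The variable names are illustrative, and the price calculation is an inference from the parallel structure rather than a formula printed in the text.

```python
# Slopes taken from equations (1) through (5).
B_VP, B_VQ = 0.775, 0.227   # food price and quantity on food expenditures
B_YP, B_YQ = 0.533, 0.467   # income price and real components on income
E_VY = 0.906                # expenditures on income, equation (1)

# Quantity of food relative to real income, as combined in the text.
quantity_elasticity = (B_VQ / B_YQ) * E_VY

# Food price relative to the consumer price index (assumed parallel form).
price_flexibility = (B_VP / B_YP) * E_VY

print(f"quantity elasticity: {quantity_elasticity:.2f}")
print(f"price flexibility:   {price_flexibility:.2f}")
```

A price flexibility of about 1.3 is consistent with the statement that a 10 per cent change in the consumer price index has been accompanied by roughly a 13 per cent change in food prices.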

A simple demand equation was also computed from the above data in order 
to get a direct computation of price elasticity of demand as well as income 
elasticity of demand when both price and income are explicit in the equation. 
The required new variable was food prices adjusted for the general price level— 
(V_p) deflated by the consumer price index (p).


    \log V_q = a + b \log \frac{V_p}{p} + c \log Y_q

    \log V_q = 1.3658 - .261 \log \frac{V_p}{p} + .517 \log Y_q
                        (.17)
    R^2_{1.23} = .94                                                (7)


The income elasticity of demand (c) compares with 0.44 in equation 6, which is 


3 In order to illustrate the “tie in” of separate price and quantity elasticities in equations 2 to 5, the income
elasticity in equation (1) also may be represented as the following identity which combines related price and
quantity contributions to expenditures and income.

    \frac{d \log V}{d \log Y} =
        \frac{\frac{d \log V_p}{d \log V} + \frac{d \log V_q}{d \log V}}
             {\frac{d \log Y_p}{d \log Y} + \frac{d \log Y_q}{d \log Y}}
        \cdot \frac{d \log V}{d \log Y}
within one standard error of the above estimate. The price coefficient looks
reasonable although the standard error is relatively large. 
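
To see what the coefficients in equation (7) imply, the sketch below converts proportional changes in the deflated food price and in real income into a proportional change in the quantity of food. It assumes the equation is double-logarithmic with quantity as the dependent variable, as the surrounding text indicates. The function name is hypothetical and the constant term is not needed for proportional changes.

```python
PRICE_COEF = -0.261   # coefficient on log of deflated food price
INCOME_COEF = 0.517   # coefficient on log of real income

def quantity_response(rel_price_change=0.0, real_income_change=0.0):
    """Proportional change in food quantity for given proportional changes."""
    factor = ((1.0 + rel_price_change) ** PRICE_COEF
              * (1.0 + real_income_change) ** INCOME_COEF)
    return factor - 1.0

price_effect = quantity_response(rel_price_change=0.10)
income_effect = quantity_response(real_income_change=0.10)
print(f"10% higher relative food price: {100 * price_effect:.1f}% quantity change")
print(f"10% higher real income:         {100 * income_effect:.1f}% quantity change")
```

Because the standard error of the price coefficient is large relative to the coefficient itself, the price response in particular should be read as a rough indication only.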


Cross-Section Surveys 

The above elasticities are fairly close to comparable results from cross section 
surveys of consumer expenditures for food relative to income levels. Such a 
survey of urban consumers taken in 1948 shows an income elasticity of about 
0.4 for all food expenditures and 0.9 for food expenditures away from home.⁴
The latter is primarily expenditures for services of restaurants and other eating 
places. Since the survey was taken at a given time, there was no price variation 
in either spending or income. 

A similar survey taken in 1955 showed an income elasticity of expenditure
for purchased food of 0.44 (this included food eaten away from home and
alcoholic beverages).⁵ The money value of purchased and non-purchased food
combined, relative to income, indicated an elasticity of 0.33; a smaller response
to income than for purchased food alone would be expected. Both these relationships
were very close to linear on double logarithmic paper, indicating a fairly
constant slope at all income levels. Very high correlations and small standard errors
of the coefficients show small deviation from the average income-expenditure
relationship.

DEMAND FOR MARKETING AND PROCESSING SERVICES 


Possibly two-thirds or more of consumer expenditures for goods derived from 
domestically produced farm products consist of the bill for marketing. These are 
payments to agencies that assemble, process, and perform other functions re- 
quired to get farm products to the consumer in the form, time, and place desired. 
Labor costs in processing industries and in wholesale and retail trade, together 
with transportation charges, made up 60 per cent of the 1956 marketing bill for 
farm produced food. Other charges, including fuel and power, packaging mate- 
rials, and machinery made up 34 per cent. Profits before taxes accounted for 6 
per cent; about 3 per cent after taxes. Each major cost element in the market- 
ing bill has more than tripled since 1939. These costs continue to rise and in 
general are very unresponsive to downward adjustments in either economic 
activity or the general price level. Thus the farmer is usually the first to feel 
the impact of a reduction in consumer spending for food and other products of 
the farm. 

In recent years the marketing bill for farm-produced food has totaled almost 
60 per cent of expenditures for food. In 1957 the farm share was about 39 per 
cent of the retail cost of food. For livestock products in general the farm value is 
a larger share of retail cost than is the share for bakery products, which require 
considerable processing, or for fruits and vegetables, which must be transported 
long distances in refrigerated facilities. 

Farm commodities are little more than a raw material in most nonfood products.
In textiles, leather, tobacco and alcoholic beverages the farm value
probably averages only about a tenth of consumer expenditures. Thus the farm
share of expenditures for all farm products in most years probably represents
less than a third of the total retail expenditure.


4 Faith Clark, et al., Food Consumption of Urban Families in the United States, Agricultural Information
Bulletin No. 132, USDA, 1954, p. 39.

5 Food Consumption of Households in the United States, 1955 survey, Report No. 1, USDA, 1956, p. 7. Per
capita income after taxes for families of 2 or more was related to per capita purchases of food. The open ends of the
income distribution were not used; this left 7 income groups ranging from $1,000 to $10,000.


Price, Quantity, and the Marketing Bill Relative to Income 

Expenditures for processing and marketing services used in getting farm 
products to the consumer are responsive to changes in income. A 10-per cent in- 
crease in per capita dollar income has, on the average, resulted in an increase of 
about 9 per cent in the marketing bill. This ratio may differ some in different 
periods with varying rates of unemployment. 

In the period analyzed (1929-41 and 1948-56) about 60 per cent of a change
in the marketing bill was associated with price change, while about 40 per cent
was due to change in “quantity” (deflated value) of services. Coefficients for
these analyses, like those in equations (2) and (3), have very small standard
errors, and the correlation coefficients fell in the .97 to .99 range, indicating a
small scatter in the observations. Using the same procedure as the one used
above for expenditures for food, the computed income elasticity for real
components indicates that the “quantity” of services increases by about 7¾ per cent
with a 10 per cent increase in income (equation 9). Changes in the price
component of the marketing bill were about proportionate to changes in the
consumer price index.


$$\log M = -.632 + \underset{(.034)}{.903} \log Y, \qquad r = .986 \tag{8}$$

$$E_{M_q \cdot Y_q} = \frac{d \log M_q / d \log M}{d \log Y_q / d \log Y} \cdot \frac{d \log M}{d \log Y} = \frac{.401}{.467}\,(.903) = .775 \tag{9}$$


The demand for marketing and processing services connected with food is
somewhat more responsive to income changes than are total consumer expenditures
for food. This was to be expected, as was the associated implication of
a smaller response of consumption of farm products at the farm level
to income changes.


DEMAND FOR FARM PRODUCTS AT THE FARM LEVEL 


Changes in domestic and export demand for farm products and their influence
on prices and incomes are of major concern to farmers, other businessmen,
and consumers. What effect will a given rise in consumer income or a
change in supplies have on prices for farm products and farm income? Most
statistical analyses indicate a very small price elasticity of demand and a very low
income elasticity of demand for farm products as a whole. Thus neither changes
in average prices nor variations in consumer income have much influence on
domestic consumption of foods in particular. Some domestic nonfood uses may
be more responsive than food uses to changes in prices and incomes. Consumption
of cotton, which is in competition with synthetics, and exports of such
commodities as wheat, rice, tobacco, and oils as well as cotton apparently
are very responsive to fairly large changes in relative prices.


Price, Quantity, and Cash Receipts Relative to Income

The farm value ($F$) of farm produced food products in recent years has been
equal to about 40 per cent of consumer expenditures for food. This value was
highly correlated with consumer income during the period 1929-41 and 1948-56.
The analysis for this period indicates that on the average a 10 per cent increase
in consumer income was associated with a little more than 9 per cent increase
in the farm value of foods.

$$\log F = -.758 + \underset{(.044)}{.913} \log Y, \qquad r = .98 \tag{10}$$


The fact that the income elasticity of farm value of food is as high as for retail 
expenditures may be a little disturbing. Price and income elasticities of demand 
at the farm level are very low. Thus when supplies are large relative to demand, 
as they have been in recent years, farm prices and cash receipts decline despite 
rising consumer income.⁶ The influence of the supply response on expenditure
is illustrated in a later section. 

The quantity of farm products produced and consumed domestically changes 
relatively little from year to year. Price fluctuations account for most of the 
variation in farm value of food. 


$$\log F_p = -.012 + \underset{(.013)}{.931} \log F, \qquad r = .998 \tag{11}$$

$$\log F_q = 2.013 + \underset{(.012)}{.069} \log F, \qquad r = .78 \tag{12}$$


The flexibility of the price component of farm value ($F_p$) relative to the price
component of consumer income ($Y_p$) shows an elasticity ($E_{F_p \cdot Y_p}$) of 1.6;
that is, farm prices of food tend to vary about 1.6 times as much as consumer
prices.

The quantity component of per capita farm value ($F_q$) was related to real
consumer income per capita in the same way as computed above for retail value
and the marketing margin:

$$E_{F_q \cdot Y_q} = \frac{d \log F_q / d \log F}{d \log Y_q / d \log Y} \cdot \frac{d \log F}{d \log Y} = \frac{.069}{.467}\,(.913) = .134 \tag{13}$$


6 In addition the negative constant term in the equation reduces the estimate of farm value more than the
smaller constant in equation (1) for retail food expenditures. However, for the entire period, price and quantity
changes have resulted in variations in farm value very close to variations in consumer income.


These relationships indicate that the income elasticity of demand for food (at
the farm level) is less than 0.15. This suggests that a 10 per cent increase in real
consumer income per person would usually result in a gain of only about 1.3 per
cent in per capita consumption of foods. The rise in consumption would represent
mostly an upgrading of the diet to higher-cost foods rather than more
pounds of food.
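The three "real component" elasticities in equations (6), (9), and (13) follow one pattern: the quantity coefficient for the value series is divided by the quantity coefficient for income and multiplied by the current-dollar income elasticity. The sketch below repeats that arithmetic. The function and variable names are hypothetical, and the common denominator .467 is read from the damaged print of equation (6), so it should be treated as an assumption.

```python
# Sketch of the arithmetic behind equations (6), (9), and (13).
# Names are illustrative; the .467 denominator is an assumption taken from
# the damaged print of equation (6).

def real_elasticity(quantity_coef_value: float,
                    quantity_coef_income: float,
                    dollar_income_elasticity: float) -> float:
    """Income elasticity of the quantity component of a value series."""
    return quantity_coef_value / quantity_coef_income * dollar_income_elasticity


Y_QUANTITY_COEF = 0.467  # d log Y_q / d log Y

cases = {
    "retail food expenditures, eq. (6)": (0.227, 0.906),
    "marketing bill, eq. (9)": (0.401, 0.903),
    "farm value, eq. (13)": (0.069, 0.913),
}

results = {
    name: real_elasticity(q_coef, Y_QUANTITY_COEF, dollar_elasticity)
    for name, (q_coef, dollar_elasticity) in cases.items()
}

for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

The computed values agree with the printed .44, .775, and .134 to within rounding of the third decimal.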

A simple demand equation gives a direct computation of income elasticity of
demand as well as price elasticity of demand at the farm level. Per capita
consumption at the farm level ($F_q$) was correlated with prices received for food
relative to wholesale prices ($F_p/p_w$) and real consumer income per capita
($Y_q$). An income elasticity of 0.18 compares with the 0.13 implied in the above
calculation (equation 13); the difference is equal to about 2 standard errors of
the coefficient. A very low price elasticity of demand also is indicated.


$$\log F_q = a + b \log \frac{F_p}{p_w} + c \log Y_q$$

$$\log F_q = 1.8279 - \underset{(.054)}{.121} \log \frac{F_p}{p_w} + \underset{(.026)}{.181} \log Y_q, \qquad R_{1.23} = .87 \tag{14}$$


These results compare with a price elasticity of demand at the retail level of
-.26 and an income elasticity of demand of nearly .52, as computed in equation
(7). Estimates of prices received by producers for food based on independent
supply estimates and income provide a better basis for appraising changes in
cash receipts than would the farm value-income relationship in equation (10)
above.

Measures of the flexibility of consumption at the farm level relative to income
and price changes (equations 13 and 14) are somewhat lower than elasticities
most widely reported in the literature.⁷ Most of these studies were based on a
retail-price-weighted index of per capita consumption which was not designed
to measure consumption at a farm equivalent level. Measurements of consumption
at the farm level rather than at some higher level in the distribution system
should yield a lower income elasticity of demand. This assumes, of course, that
“quantities” of marketing and processing services are reflected in quantities
consumed. Price elasticity of demand for farm products is generally very small
at both the retail and farm level.


7 See Girshick, M. A., and Haavelmo, T., “Statistical Analysis of the Demand for Food,” Cowles Commission
Papers, New Series, No. 24, 1947, p. 109; Tintner, G., “Multiple Regression for Systems of Equations,”
Econometrica, 14: 34-36, 1946; Burk, Marguerite C., “Changes in the Demand for Food from 1941 to 1950,” Journal
of Farm Economics, 33: 281-98, 1951; Working, Elmer J., “Appraising the Demand for American Agricultural Output
During Rearmament,” Journal of Farm Economics, 34: 209-15, 1952.

Although changes in relative prices and incomes have little influence on
combined per capita use of farm products, they do influence the kinds of goods
consumers want. Retail weight of food consumption per person in pounds apparently
has changed little in the last quarter century. But there have been sizable
shifts from grains and potatoes to some other vegetables, fruits, meats, and
other higher-cost foods. This results in an upgrading of the diet and an increase
in consumption in an economic sense, that is, increased use of resources.


INCOME ELASTICITIES FOR CONSUMER EXPENDITURES, THE MARKETING BILL
AND FARM VALUE

The flexibility of consumer expenditures for food relative to income apparently
is a weighted average of the income elasticities of the marketing bill and
the farm value.⁸ Since these weights are shifting over time, some changes in
elasticities are also implied. Weights used to combine the elasticities are mean
values of the farm value ($F_w$) and the marketing bill ($M_w$) expressed in both
current dollars and in quantities as approximated from deflated aggregates.
Income elasticities of expenditures computed above, combined by appropriate
weights, approximately equal directly computed elasticities of value relative to
income.

$$E_{V \cdot Y} = (E_{M \cdot Y})\,M_w + (E_{F \cdot Y})\,F_w$$

Food on quantity basis:

$$.434 = (.775)(.467) + (.134)(.533) \tag{15}$$

Food on current dollar basis:

$$.908 = (.903)(.556) + (.913)(.444) \tag{16}$$
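The weighted-average relation in equations (15) and (16) can be checked directly. The following sketch uses only the elasticities and weights printed above; the function name is hypothetical.

```python
# Check of the weighted averages in equations (15) and (16).
# All figures are those printed in the text; names are illustrative.

def weighted_elasticity(e_marketing: float, w_marketing: float,
                        e_farm: float, w_farm: float) -> float:
    """Income elasticity of food expenditures as a weighted average of the
    marketing-bill and farm-value elasticities."""
    return e_marketing * w_marketing + e_farm * w_farm


quantity_basis = weighted_elasticity(0.775, 0.467, 0.134, 0.533)  # equation (15)
dollar_basis = weighted_elasticity(0.903, 0.556, 0.913, 0.444)    # equation (16)

print(f"quantity basis: {quantity_basis:.3f} (direct estimate in equation 6: .44)")
print(f"dollar basis:   {dollar_basis:.3f} (printed in equation 16: .908)")
```

Both combinations reproduce the printed totals to within rounding, which is the sense in which the text says the weighted elasticities "approximately equal" the directly computed ones.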


When price influences are eliminated, the above analyses indicate that the 
“quantity” of services used in processing and marketing farm products is much 
more responsive to income changes than is per capita consumption of farm pro- 
duced food—possibly as much as 5 or 6 times as responsive. This suggests that 
consumers have wanted, and have been willing to pay for, more and more serv- 
ices of marketing and processing with their food. High incomes result in less 
home production and more “eating out” as well as more packaging, more proc- 
essing, greater variety of products in season and out, and probably better 
quality. Such services in total may require, because of less waste, fewer farm 
products per serving than before. Thus increases in retail consumption may re- 
sult in relatively much smaller changes in demand for products of the farm. 
And because the farm share of many products is small, relatively large price 
changes at the farm level have little influence on retail prices of many products 
of the farm. 


PRICE AND INCOME ELASTICITIES RELATIVE TO THE SUPPLY RESPONSE

A knowledge of the supply response is especially significant in appraising the
effect of a demand shift when you know that both price and supply (consumption)
will change. The partial elasticities of demand theory may have little
meaning in this case; you seldom get a change in consumption or price alone but a
change in both. And changes in price and quantity have different effects on
expenditures and returns to farmers depending on price elasticity of demand and
the supply response. Although price and income elasticities are supposed to
reflect consumer behavior characteristics, empirical measurements of these
elasticities from time series data probably are influenced also by the supply
response, particularly when technological developments have been large.


8 See “The Long-Run Demand for Farm Products,” by R. F. Daly, Agricultural Economics Research, July
1956, p. 78.

In some earlier work on relationships of price flexibility and income elasticity
of demand it was concluded that variation in expenditure ($p \times q$) due to
quantity changes would approach the smaller elasticity of quantity relative to
income and changes in expenditure due to price variation would approach the
higher price flexibility relative to income. Thus income elasticity of expenditures
for food should range somewhere between the coefficient of price flexibility
relative to income and the elasticity of quantity relative to income.⁹

Although the conclusion was not immediately obvious, the supply response is
necessary to determine how much price, quantity, and expenditure will change
with a given change in income.¹⁰ The influence of the supply response on partial
elasticities expressing price changes relative to income and quantity changes
relative to income (income elasticity of demand) can be better illustrated by
using a simple demand and supply equation with hypothetical elasticities.¹¹


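Demand:

$$\text{1.}\quad \log q = b \log p + c \log y$$

$$\text{1a.}\quad \log p = \frac{1}{b} \log q - \frac{c}{b} \log y \tag{17}$$

Supply:

$$\text{2.}\quad \log q = e \log p$$

$$\text{2a.}\quad \log p = \frac{1}{e} \log q$$

$$\text{Let } b = -0.2,\quad c = 0.25,\quad \text{and } e = 0.2 \tag{18}$$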


In the usual demand equation, using the assumed coefficients, a 10 per cent shift
in the demand curve to $D_1$ (figure 666) would be associated with either a 12½ per
cent increase in price ($p$ to $p_1$) reflecting the effect of ($-c/b$) or a 2½ per cent
increase in consumption ($q$ to $q_1$) reflecting the effect of ($c$). And the expenditure
product ($p \times q$) would vary according to whether the change was due to a change in
price or in quantity. When supplies increase rapidly relative to demand shifts,
cash receipts to producers tend to decline despite rising consumer demand.

[Fig. 666. Hypothetical relationships: demand and supply. U. S. Department of Agriculture, Neg. 6158-58 (5), Agricultural Marketing Service.]


9 See “Some Considerations in Appraising the Long-Run Prospects for Agriculture,” by R. F. Daly, Long-Range
Economic Projection, NBER Studies in Income and Wealth, Vol. 16, 1954, p. 148. Thus,

$$\frac{d \log V}{d \log Y} = \frac{d \log p}{d \log Y} + \frac{d \log q}{d \log Y}$$

10 It was possible in the earlier work to check these interrelationships from a given set of data consisting of a
demand equation, price equation, and value as a function of income.
1. $\log q = a + b \log p + c \log y$
2. $\log p = a_1 + b_1 \log q + c_1 \log y$
3. $\log (pq) = \alpha + \beta \log Y$
Although relationships among empirical coefficients for these equations check out when both price and quantity
take on functions specified, the first two equations are the same except for variations in coefficients associated with
changing the dependent variable; both are demand equations. The logical sequence of developments led to the
need for a supply response in order to specify the probable change in quantity.

11 The author is particularly indebted to Anthony S. Rojko for his help in sharpening the focus on this problem.

When the supply response is specified, we will get changes in both price and
consumption. The flexibility of quantity relative to income in this case can be
shown by substituting equation (2a) in (1), with the following result:

$$\log q = \frac{c}{1 - b/e} \log y$$

$$\frac{d \log q}{d \log y} = \frac{ce}{e - b} = .125 \tag{19}$$


Flexibility of price relative to income, when the supply response is specified,
can be shown by substituting equation (2) in (1a).

$$\frac{d \log p}{d \log y} = \frac{-c}{b - e} = .625 \tag{20}$$

With the change in both price and consumption specified, the elasticity of
expenditure ($v = p \cdot q$) should equal the sum of the elasticities in equations (19) and
(20) when the supply response is specified.

$$\frac{d \log v}{d \log y} = \frac{-c}{b - e} + \frac{ce}{e - b}$$

$$.75 = .625 + .125 \tag{21}$$


The final adjustment, after accounting for the supply response, is represented
in the figure by the price adjustment $p$ to $p_2$ and the quantity adjustment $q$ to $q_2$.
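The hypothetical example can be verified by solving the two-equation system with the assumed coefficients. The sketch below uses exact fractions so that no rounding enters; the function name is hypothetical, and the formulas are those obtained by substituting the supply equation into the demand equation as described above.

```python
# Sketch of the hypothetical demand-supply example, equations (17) to (21).
# Demand: log q = b log p + c log y.  Supply: log q = e log p.
from fractions import Fraction


def equilibrium_elasticities(b, c, e):
    """Income elasticities of quantity, price, and expenditure once the
    supply response is taken into account."""
    quantity = c * e / (e - b)                 # equation (19)
    price = -c / (b - e)                       # equation (20)
    return quantity, price, quantity + price   # the sum is equation (21)


b, c, e = Fraction(-1, 5), Fraction(1, 4), Fraction(1, 5)  # values in (18)
q_el, p_el, v_el = equilibrium_elasticities(b, c, e)

# Partial effects when only price or only quantity adjusts to the shift.
price_only = -c / b   # 5/4: a 10 per cent shift raises price 12 1/2 per cent
quantity_only = c     # 1/4: a 10 per cent shift raises quantity 2 1/2 per cent

print(float(q_el), float(p_el), float(v_el))  # prints: 0.125 0.625 0.75
```

With the supply response specified, the price and quantity effects are both smaller than the partial effects, and together they account for the expenditure elasticity of .75.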

The hypothetical example above greatly oversimplifies the problem, par- 
ticularly of the supply response. The supply curve for farm products probably 
is very inelastic with respect to current prices. Empirical measurements above 
for food would suggest a coefficient as small as 0.1. But over time the entire 
curve shifts to the right at varying rates for different commodities and different 
rates of technological development and these shifts in recent years have resulted 
in output expanding more rapidly than shifts in demand. Nevertheless the 
illustration helps to indicate some of the interrelationships among partial 
elasticities and the actual price and consumption changes likely in response to 
a shift in demand. 

In general, the larger the price coefficient in the supply equation (greater 
the supply response to a change in price) the lower the income elasticity of 
expenditures, i.e., relatively more of the value change is due to quantity and 
less to prices. For example, price and income elasticity of demand for meat ani- 
mals and poultry may be much the same. But the supply response for meat 
animals has been much slower than for poultry. As a result, meat consumption 
per capita rose 12 per cent while relative prices rose by a fourth from 1925-29 
to the 1951-55 average. On the other hand, poultry consumption increased by 
two-thirds and relative prices of poultry declined by one-third over the same 
period. These big changes in prices and consumption of poultry, due to the 
supply response, may account partly for higher price and income elasticities for 
poultry estimated from time series data. Price and consumption trends observed
above for these commodities suggest that consumer expenditures for meat animals,
especially beef, have been more responsive to income increases than
have expenditures for poultry.
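The general statement that a larger supply coefficient lowers the income elasticity of expenditures can be illustrated with the hypothetical model of equations (17) to (21). Adding the price and quantity elasticities gives c(1 + e)/(e - b), and the quantity part of that total is e/(1 + e). The sketch below keeps the assumed b = -0.2 and c = 0.25 and varies e over a hypothetical grid; none of these supply values is an estimate for any actual commodity.

```python
# Sketch: effect of the supply coefficient e on the income elasticity of
# expenditure in the hypothetical model. The grid of e values is illustrative.

def expenditure_elasticity(b: float, c: float, e: float) -> float:
    """Sum of the price and quantity elasticities relative to income."""
    return c * (1.0 + e) / (e - b)


def quantity_share(e: float) -> float:
    """Part of the expenditure elasticity that comes from quantity change."""
    return e / (1.0 + e)


B, C = -0.2, 0.25
supply_coefficients = [0.1, 0.2, 0.5, 1.0, 2.0]

table = [(e, expenditure_elasticity(B, C, e), quantity_share(e))
         for e in supply_coefficients]

for e, v_el, share in table:
    print(f"e = {e:3.1f}  expenditure elasticity = {v_el:.3f}  quantity share = {share:.2f}")
```

Under these assumptions the expenditure elasticity falls and the quantity share rises as e increases, which matches the contrast drawn between meat animals (slow supply response) and poultry (rapid supply response).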


SOME IMPLICATIONS FOR AGRICULTURE 


The small response of consumption to changes in relative prices and income,
a very inelastic demand, has important economic implications for agriculture.
Rising consumer incomes exert a small influence on per capita use of farm
products. Moreover, the very low price elasticity of demand results in considerable
instability in grower prices and incomes when farm output differs much from
the normal market outlets for farm products. For example, under given demand
conditions, a 3 or 4 per cent increase in per capita supplies of food for
domestic use may lead to a decline of around a fifth in grower prices. Farm
incomes would also decline as the reduction in prices would more than offset
larger marketings. These well known characteristics have influenced farm
policies designed to stabilize farm prices and incomes. However, the small
response of consumption to price and income changes does not necessarily
spell doom for agriculture. As shown above, for the period 1929-41 and 1948-56,
both retail expenditures for food and farm value were highly correlated with
changes in consumer income. An inelastic price and income response exerts
great upward pressure on prices when supplies are relatively short.
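The supply example in this paragraph implies a sizable drop in cash receipts. The sketch below works through the arithmetic, treating "around a fifth" as exactly 20 per cent, which is an assumption made only for illustration; the function name is hypothetical.

```python
# Sketch: change in cash receipts when supplies rise 3 or 4 per cent and
# grower prices fall by a fifth (taken here as exactly 20 per cent).

def receipts_change(pct_quantity: float, pct_price: float) -> float:
    """Percentage change in price times quantity."""
    return ((1.0 + pct_quantity / 100.0) * (1.0 + pct_price / 100.0) - 1.0) * 100.0


PRICE_DECLINE = -20.0  # assumption: "around a fifth"

low_supply_case = receipts_change(3.0, PRICE_DECLINE)
high_supply_case = receipts_change(4.0, PRICE_DECLINE)

# Price elasticity implied by the example: quantity change over price change.
implied_elasticities = [3.0 / PRICE_DECLINE, 4.0 / PRICE_DECLINE]

print(f"receipts change: {low_supply_case:.1f} to {high_supply_case:.1f} per cent")
print(f"implied price elasticity: {implied_elasticities[0]:.2f} to {implied_elasticities[1]:.2f}")
```

On these figures receipts fall by roughly a sixth, and the implied price elasticity of about -0.15 to -0.20 is of the same low order as the farm-level estimate in equation (14).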

We may be surprised at the price rise which would result from a reduction 
in per capita food supplies of 3 to 4 per cent. For a hungry man, food has no 
substitute and when supplies are short prices rise abruptly. Since all foods 
compete with each other, to some extent, price and income elasticities for in- 
dividual foods would tend to be less inelastic than for foods as a whole. Thus the 
price effect of a cut in per capita supplies of individual foods would be less 
than a similar cut for all foods. 

The longer run prospects for agriculture hinge largely on the relationship 
between a relatively slow expansion in demand and the supply response in 
agriculture. Output has expanded rapidly in the last 15 to 20 years. New techno- 
logical developments have materially increased yields and inputs such as 
machinery, fertilizer and insecticides have rapidly increased output per man. 
Currently and in recent years farm output has exceeded domestic use plus 
record exports swelled by surplus disposal programs. Production continues 
high and carryover stocks of wheat, cotton, tobacco and feed grains are exces- 
sive. Thus large supplies will complicate adjustments in agriculture for several 
years. 

Prospects for the longer-term future are usually strongly influenced by cur- 
rent developments and the recent past, but even discounting current large 
supplies, many economists feel that we will have no difficulty in meeting prob- 
able increases in requirements for farm products. However, if population 
growth continues rapid, expansion in farm output probably would need to be 
somewhat more rapid than long-run growth. Some acceleration of the trend 
in farm output, after excess stocks have been worked down, probably would not 
exert great pressure on production facilities in agriculture. However, it is
unlikely that supply increases in the next two decades will be so large relative to
expansion in demand as to result in persistent disastrously low prices and
incomes for agriculture.


INVESTMENT ESTIMATES OF UNDERDEVELOPED COUNTRIES:
AN APPRAISAL


I. ABRAHAM*
United Nations

Considerable use is being made of the estimates of capital formation 
now available for a large number of underdeveloped countries. Users 
of these statistics are not always sufficiently aware of their short- 
comings, however, with the result that unwarrantable inferences and 
conclusions are not infrequently drawn from the figures. An attempt 
is made to appraise this important body of data by examining the basic 
methodology and sources on which the estimates rest in many cases.
Attention is drawn to common deficiencies and the practical possibilities 
of improvement through substitution of improved source data, pro- 
cedures, and concepts. 


1. INTRODUCTION 


ALTHOUGH the great majority of countries now publish statistics of capital
formation, the unreliability of the estimates in a great many cases is
probably such that only a doubtful—if not actually misleading—basis is 
provided for many of the analytical studies in which the figures play an im- 
portant role. In the underdeveloped countries, especially those in which con- 
scious efforts are being made to accelerate economic development by changing 
the structure or even character of the economic system, a distorted picture of 
the facts of the situation in the past and as they unfold in the present may, it 
is easy to see, have quite serious implications. 

The estimates of investment for a number of the more highly developed 
countries with elaborate statistical systems are no doubt now reasonably ade- 
quate for most purposes, provided levels and movements of minor components 
are not in question and small percentage changes even of broad totals are not 
taken too seriously. For countries at a fairly primitive stage of statistical
development the estimates may serve as an indication of the order of magnitude
(which, if set side by side with a measure of total production such as gross 
national product, can give an indication of the use of resources), but hardly 
more. Most estimates fall in between these two classes in quality, that is to 
say, they are useful in various connections but always with reservations of a 
more or less serious kind. In general, these figures are of interest for the simpler 
kinds of structural studies but quite inadequate to support conclusions based 
on models relating investment levels to dynamic processes. Yet, if the rapidly 
growing body of economic studies dealing with the development of these 
countries and related topics is examined critically, innumerable examples come 
to light of unwarranted conclusions being drawn from investment data with 
margins of error too large to permit their use in this way. As academic or
theoretical investigations such studies may be highly suggestive and give valuable
insights into relationships, but the all-too-human temptation to claim
verisimilitude and a “useful” result often leads to the glossing over of the weak
and sometimes even missing links in the chain of analysis. (No doubt in many
cases the author himself may be supremely oblivious to the shortcomings of his
data.)


* The author’s views are based largely on direct observation of the national accounts work of various Asian
and Latin American countries. Although he alone is responsible for the opinions expressed, he wishes to thank
R. C. Geary and other colleagues of the National Accounts Branch of the United Nations Statistical Office for
helpful discussions.

What are the reasons for the weakness of the capital formation estimates of 
the underdeveloped countries? Obviously the unsatisfactory state of much of 
the basic statistical information is a leading reason. Another cause is the nature 
of these economies, i.e., the important role played by nonmonetary activities, 
the large number of small-scale producers, the dominant position of agricul- 
ture, the absence of highly organized markets. In a way these attributes are 
themselves related to the retarded state of statistics, since many census and 
collection activities are hampered by such economic arrangements; moreover, 
as we shall see, they give rise to special problems of a conceptual kind. To these 
various obstacles one must add the shortage of skilled statistical personnel and 
lack of experience generally in the more complex and sophisticated forms of 
empirical statistical research. 

The gloomy prospect of achieving useful results painted by these remarks 
is, however, rather exaggerated. The fact that most, and sometimes practically
all, capital equipment is imported in underdeveloped countries is an inestima- 
ble advantage. The relatively simple structure of many of the countries also 
simplifies the task. Finally, the experience of countries farther advanced in 
such research is at the disposal of countries just beginning. These advantages 
may go far to offset the various obstacles, even if they do not quite balance 
them. 

The object of this paper is to appraise the body of capital formation statistics 
that now exists for the underdeveloped countries. This will be done by examining
the way in which the estimates are put together, with attention to sources
of basic data and methods, as well as to the problems that exist and the weak 
spots to which they give rise. At the same time an attempt will be made to 
indicate solutions to some of the larger problems exposed. In order to avoid 
burdening the reader with unessential details, existing differences in concepts 
and methodology will be brought out only where they are important; in the 
author’s opinion the similarities go far enough for an adequate general ap- 
praisal to be possible without exploring the circuitous byways of individual 
country practices. A disadvantage of such an approach is, of course, that the 
reader may be misled into believing that the discussion applies accurately to 
individual cases. It is therefore better to explain at the outset that it does not 
and to plead only that without such simplification based on a sifting of the evi- 
dence the author and reader both would soon be bogged down in a morass of 
special cases. 

2. THE INVESTMENT IN EQUIPMENT 

The general approach ordinarily used by the countries with which we are
concerned may be described as the production or commodity-flow method,
inasmuch as the method seeks to build up a total from the supply side, emphasizing
production and foreign trade statistics. Actually, most countries use this
approach, but a considerable and growing number of countries, especially in
Europe, measure capital formation from the demand side by utilizing data on
expenditures for capital goods by purchasers (collected, for example, by mail
questionnaires sent to firms).¹ It would not be proper to describe the latter
method as more “advanced.” The most primitive territories use the production
method; so does the United States, which measures producers’ durable equipment
primarily by the commodity-flow technique, as applied to industrial
census materials. Because of the overwhelming importance of imported equipment
in less developed countries, the record of imports, i.e., trade statistics,
replaces the production census as the core of the production method. Each of
the two methods has clear-cut advantages and disadvantages: thus, the
expenditure method applies equally to outlays for construction and for equipment
and facilitates the classification of investment by industry or sector of
use; the production method allows the statistician greater latitude in defining
terms so that he is not bound by business accounting conventions, e.g., in
distinguishing capital from current outlays. But these considerations do not
really play an important part in the selection of an approach, and the preference
for the production method over the other in the less industrialized countries
stems from (a) the importance of capital goods imports and the relatively reliable
nature of their import statistics, and (b) the existence of a large number
of small-scale traders and producers, which would make the task of collecting
capital expenditure data directly from business difficult.
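The supply-side logic of the commodity-flow method can be sketched in a few lines. Everything in the example below is hypothetical: the figures, the field names, the share of supply assumed to be acquired by firms, and the single margin rate standing in for duties, transport, and trade margins. It is an illustration of the general idea, not a description of any country's procedure.

```python
# Minimal sketch of a commodity-flow estimate of investment in equipment.
# All figures, names, and rates are hypothetical illustrations.
from dataclasses import dataclass


@dataclass
class EquipmentFlow:
    imports_cif: float      # imports of durable goods, valued at the border
    domestic_output: float  # domestic production of the same goods
    exports: float          # exports and re-exports
    capital_share: float    # assumed share of supply acquired by firms
    margin_rate: float      # assumed duties, transport, and trade margins


def equipment_investment(flow: EquipmentFlow) -> float:
    """Build the estimate up from the supply side, at purchasers' values."""
    supply = flow.imports_cif + flow.domestic_output - flow.exports
    return supply * flow.capital_share * (1.0 + flow.margin_rate)


example = EquipmentFlow(imports_cif=800.0, domestic_output=150.0,
                        exports=50.0, capital_share=0.9, margin_rate=0.25)
estimate = equipment_investment(example)
print(f"estimated investment in equipment: {estimate:.1f}")
```

In a country where nearly all equipment is imported, the import term dominates the total, which is why the reliability of the trade statistics matters so much for the result.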

The analysis of imports with a view to segregating durable goods which will 
be acquired by firms from all other imports is something less than a straight- 
forward matter. The question of what is a capital good must first be faced. If 
the answer is that in principle a capital good is a good acquired by firms with 
an expected life of at least a year, the more controversial question remains: to 
what extent are expenditures on repairs and maintenance of capital to be in-
cluded in capital formation?² The position taken has a direct bearing on the list
of imported capital goods to be drawn up because a decision must be reached 
in regard to parts as distinct from finished goods. A solution advocated by the 
United Nations is that where such outlays do more than merely keep capital 
goods in a state of constant repair, i.e., where they extend the lifetime of an 
asset or raise its productivity, they should be capitalized.³ This definition is,
unfortunately, difficult to put into practice where import statistics form the 
starting point for estimating investment in equipment, since as a rule there is 
no way of knowing the precise uses to which purchasers will put particular 
parts. The notion held by the Scandinavian countries that all repairs and main- 
tenance except for daily upkeep are part of a nation’s gross investment is much 
easier to apply inasmuch as practically all parts may then be included.⁴ Not-
withstanding the greater convenience of the Scandinavian definition, the fact
that it is at variance with commercial accounting precepts counts rather heavily
against it.


1 It should perhaps be mentioned that the expenditure method is not so much supplanting the older method
as serving as a supplement to and check on it, and as a means of obtaining additional breakdowns of the capital
formation total.

2 For convenience many countries use two or three years as the line of demarcation between capital and non-
capital goods, or else conform to general business accounting practice.

3 See A System of National Accounts and Supporting Tables, Studies in Methods, Series F, No. 2, Statistical
Office of the United Nations, New York, 1953.

4 Application of the Scandinavian definition leads to the concept of investment sometimes termed “gross-gross
investment.” For an account in English of Scandinavian practice in this respect see, for example, National Accounts
Studies—Sweden, OEEC, Paris. It appears from this study that expenditures on maintenance and repairs under
the comprehensive definition used average roughly 30 per cent of gross investment in Sweden.

As a matter of fact, most countries lean towards a solution admitting only 
major alterations and renewals as capital expenditures, even if their definitions 
are not usually precisely formulated. Such a point of view is closer to the 
practice suggested by the United Nations than to Scandinavian practice, and 
leads to a useful criterion of selection among imports of parts. Whatever 
the position taken, however, it is important that the rule be followed con- 
sistently over time. In those cases where trade statistics are so organized that 
lists including and excluding doubtful items can be prepared, it is worth draw- 
ing up alternative lists to assess the importance of borderline cases and thus 
to place the problem in some perspective. 

One further point needs to be made as regards imported parts. The remarks
above refer to replacement parts, not to parts that will serve as components
of new capital goods to be assembled within the country. Inclusion of the latter 
type in the list of imported capital goods will lead to duplication when the 
value of capital goods produced and assembled in the country is estimated at 
another stage. To avoid a situation where, for example, the value of tractor 
parts is later added to the value of tractors assembled from them, care must be 
taken to exclude such components. The same injunction holds true of build- 
ing materials if a separate estimate of construction put in place is to be made. 
Careful inspection of certain estimates that have been prepared in the past 
leaves one with the gnawing suspicion that duplication of just this kind has
taken place; even raw materials have found their way into estimates of invest- 
ment! Although the record of assembly parts in trade statistics is a complicat- 
ing element from this point of view, this record, as will appear later, offers a 
valuable point of departure for estimating the value of capital goods assembled 
in a country where more direct evidence is lacking. 

The most obvious problem arising from the use of trade statistics in estimat- 
ing equipment has of course to do with the allocation of commodities with 
alternative uses or of mixed groups of commodities occurring in the trade statis- 
tics. Where, for example, a durable good is normally acquired by both firms 
and households, it would be erroneous to lose sight of the fact that purchases 
by consumers must be regarded as consumption expenditure according to the 
definitions now almost universally accepted. Some goods or trade groups may 
even be producers’ goods, consumers’ goods, and intermediate goods or parts— 
the last named belonging in capital formation or not according to considera- 
tions of the kind discussed above. The practical question to which this leads 
may be put as follows: How much analysis and allocation of the available trade 
data are warranted? 

Actually, examination of the practices of a considerable number of countries 
contributes to the answer only slightly. While commodities belonging pre- 
dominantly in the producers’ goods, consumers’ goods, etc. category are nearly 
always allocated to the appropriate class in their entirety, practices vary widely 
where the more genuinely mixed groups are concerned. Some countries analyze 
a large number of trade groups from this point of view, others analyze and
allocate only a very few, putting nearly all the mixed groups into capital forma-
tion or leaving them out depending on whether they consist mainly of pro-
ducers’ goods or not. Naturally the nature of a country’s imports and the detail
in which its trade figures are published are relevant factors, but it does not 
seem to be the case that these factors are responsible for the different degrees 
of refinement to be found. 

The differences in refinement do not, however, seem to influence the end 
results to an important degree. Inspection of the import statistics of several 
underdeveloped countries, as well as the export statistics of the United States, 
the leading supplier of capital goods, makes it clear that the categories of 
durable goods that require allocation by receiving countries are relatively small 
in value, with the important exception of motor cars. The United States 
statistics for 1955 show, for example, that motor cars accounted for some- 
what over 7 per cent of the value of all machinery and transport equipment ex- 
ported, while car parts exceeded 8 per cent. By comparison, other finished 
durable goods and parts which were not exclusively producers’ capital were 
quite small and the fractions that might be allocated to consumers—e.g., 
parts of electric fans, sewing machines, radio receivers, refrigerators—very 
much smaller still. These facts point to the conclusion that exhaustive analysis 
of the mixed groups is unnecessary, and that only motor cars and their parts 
and possibly one or two other categories will ordinarily merit careful study; 
other categories, especially those of rather small value, may be allocated in 
some rough but reasonable fashion without affecting the over-all picture sig- 
nificantly. 

The apportionment of imported motor cars into those going to business and 
those going to households is usually based on car registration records, although 
where motor car imports are controlled import licenses are sometimes used. 
With one or two exceptions countries do not attempt to apportion expendi- 
tures for individual cars used partly for business and partly for pleasure and 
regard the purchases as for one purpose or the other depending on the main or 
stated use. It is not clear on what basis car parts are usually allocated, but a 
reasonable assumption on which to proceed would be to take the percentage 
applied to motor cars as applying also to car parts, after first eliminating 
assembly parts and minor parts for current upkeep. 
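
As a rough illustration of the apportionment just described, the sketch below applies a business share derived from registration records to both imported cars and car parts. All figures, and the function and parameter names, are hypothetical; in practice the shares of assembly parts and of minor upkeep parts would have to be estimated separately.

```python
def allocate_cars_and_parts(car_imports, parts_imports, business_registrations,
                            total_registrations, assembly_share, upkeep_share):
    """Split imported cars and car parts between capital formation and other uses.

    All arguments are hypothetical illustrative inputs, not figures for any country.
    """
    # Business share of registrations, used as the allocation percentage.
    business_share = business_registrations / total_registrations

    # Cars: each car is treated as wholly business or wholly household.
    cars_to_capital = car_imports * business_share

    # Parts: first remove assembly parts (counted with domestic assembly)
    # and minor parts for current upkeep, then apply the same percentage.
    eligible_parts = parts_imports * (1.0 - assembly_share - upkeep_share)
    parts_to_capital = eligible_parts * business_share

    return cars_to_capital, parts_to_capital


cars, parts = allocate_cars_and_parts(
    car_imports=1000.0, parts_imports=400.0,
    business_registrations=300, total_registrations=1000,
    assembly_share=0.25, upkeep_share=0.25,
)
```

With these invented inputs, three tenths of car imports and three tenths of the eligible half of parts imports enter capital formation.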

In order to arrive at the final purchase cost of imported equipment to ulti- 
mate buyers, the c.i.f. value of the imports must of course be raised to cover 
trade mark-ups, domestic transportation costs, customs, installation charges 
and any other costs entering into the final value of the asset in place. These 
charges taken together are normally fairly large in relation to the import values, 
even though the percentage relation may be expected to vary from country 
to country depending on trade practices, transport facilities, size of the country, 
etc., and from one type of capital to another depending on the trade channels 
normally used, bulkiness and other factors. For this reason it is not surprising to
find that different countries apply different margins, the range being from
under 10 per cent to 50 per cent and more. What is surprising—and significant 
from the standpoint of assessing the accuracy of investment figures—is the 
fact that most underdeveloped countries apply the same percentage mark-up 


674 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1958 


to all classes of their imports. Further, it appears that this percentage is often
arbitrarily determined, so that the wide variations found among countries 
cannot be accepted as reflecting real differences in cost structure at all, as 
might be thought at first glance. The figures for the over-all percentage mark-up 
on imports of equipment are shown below for a number of countries. 


                          Total percentage mark-up on
Country                   imported equipment*

Belgian Congo             20
Burma                     50
Ceylon                    30
Indonesia                 20
Philippines               50
Brazil                    20 and 70†
Dominican Republic        15
Panama                    15


* Based on recent official sources.
† Twenty per cent applied to direct overseas purchases by firms, seventy per cent
to purchases through importers.
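
The sensitivity of an equipment estimate to the mark-up chosen can be seen from a small calculation. The sketch below raises a hypothetical c.i.f. import value of 100 by several of the percentages shown in the table; the function name and the import figure are assumptions made for illustration only.

```python
def final_purchase_cost(cif_value, markup_percent):
    """Raise a c.i.f. import value to final purchasers' cost by a percentage mark-up."""
    return cif_value * (1.0 + markup_percent / 100.0)


# Hypothetical c.i.f. value of imported equipment (any currency unit).
CIF_VALUE = 100.0

# Over-all mark-ups reported in the table for a few of the countries.
MARKUPS = {"Panama": 15, "Indonesia": 20, "Ceylon": 30, "Burma": 50}

estimates = {country: final_purchase_cost(CIF_VALUE, pct)
             for country, pct in MARKUPS.items()}

# Spread between the highest and lowest estimate, as a percentage of the lowest.
spread = (max(estimates.values()) / min(estimates.values()) - 1.0) * 100.0
```

For identical imports, the 50 per cent margin yields an equipment figure roughly 30 per cent above that given by the 15 per cent margin, which suggests how much an arbitrary percentage can matter.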


Whereas the more developed countries may resort to censuses of distribution,
price control records, the advice of trade specialists, etc. in arriving at distri-
bution and other costs, a broad basis is often lacking in other countries. It is
far from clear, however, that the required effort is made to obtain the best esti- 
mate possible in the circumstances. Considering that imports of equipment 
form such a large part of total fixed investment in these countries, and that 
the various costs of distribution are substantial in relation to import values, 
an effort commensurate with the importance of the required adjustment is 
called for here. 

While the sources available for this purpose naturally vary, not much use 
seems to be made of a variety of potentially useful avenues of information. 
List prices issued by importers of agricultural and industrial machinery are 
one possible source. As regards agricultural machinery, farm credit agencies 
which extend loans to farmers for specific purchases are often in an excellent
position to supply information on prices of leading types of imported farm
equipment. Authorized agencies selling motor trucks and cars of foreign make
are ready sources of price information on these items. Large firms, govern-
ment monopolies, railroads, etc. which import major items of equipment are
yet another source. Such equipment, often built to buyers’ specifications, is
normally imported directly, eliminating the middleman’s mark-up. Where
ready-made data are scarce, the use of a small-scale survey to collect data on 
the mark-ups of leading equipment importers and on the selling prices of lead- 
ing items should be considered. Clearly the various possibilities for improving 
on the more or less offhand across-the-board percentage addition now so com- 
mon must be numerous. 

In addition to imported equipment, the equipment produced or assembled 
in the country has, of course, to be taken into account. In the less industrialized 


INVESTMENT ESTIMATES OF UNDERDEVELOPED COUNTRIES 675 


countries the aggregate value of such equipment usually is small by compari- 
son with what is purchased overseas; in some countries only 5 to 10 per cent of 
the total bill for machinery and equipment, or even less, is for home produced 
durable goods under this heading. Where the fraction is as small as this even 
a rough estimate will suffice, but greater care is required in other cases. Actually, 
the materials for a reasonable approximation exist in most countries: more than 
60 countries carry out censuses of production or manufacturing, and while 
many of these are not annual affairs more and more countries are supplement- 
ing their censuses by annual surveys. The existence in many underdeveloped 
countries of a very large number of small businesses of the kind that defy 
enumeration does not give rise to special difficulties here since such enterprises 
seldom produce capital equipment (as distinct from carrying out minor re- 
pairs). 

Where annual census or survey data on the output of producers’ durable 
goods are lacking, selected indicators of business activity may be used to extra- 
polate census benchmark estimates. In countries where capital goods assembled 
from imported parts are important, import statistics may again play an in- 
dispensable role if more direct information bearing on the finished output is 
not available. 
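
A minimal sketch of such an extrapolation follows. It assumes that output of producers' durable goods moves in proportion to a single indicator of activity; the indicator series, the benchmark value, and the function name are invented for illustration.

```python
def extrapolate_benchmark(benchmark_value, benchmark_year, indicator):
    """Carry a census benchmark to other years in proportion to an indicator series.

    `indicator` maps year -> index of business activity (hypothetical data).
    """
    base = indicator[benchmark_year]
    return {year: benchmark_value * level / base
            for year, level in indicator.items()}


# Hypothetical index of activity, with a production census taken in 1954.
indicator = {1954: 100.0, 1955: 110.0, 1956: 125.0}
estimates = extrapolate_benchmark(benchmark_value=40.0, benchmark_year=1954,
                                  indicator=indicator)
```

The result is only as good as the assumed proportionality, so a fresh benchmark should replace the extrapolated figures whenever a new census becomes available.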

Production valued at factory prices must of course be raised to final cost 
levels. In most cases the distribution margins added are as unsatisfactory as 
those used in revaluing imports. This problem has been discussed in connection 
with imports of equipment and the remarks apply equally here. 

Another point, not touched on thus far, has to do with stock changes of im- 
ported and domestically produced equipment. In principle, such changes have 
to be taken into account, since if net additions to stocks take place fixed invest- 
ment is diminished, and vice versa. In practice the majority of countries find it 
impossible to make allowance for this factor, and it is doubtful whether under 
normal conditions the error introduced is serious. Careful inspection of figures 
for stock changes in general, where these changes are measured, often leads 
one to suspect that the figures have little real value except in a few statistically 
advanced countries. Often stock changes as an item in the national accounts are 
deliberately restricted to the changes in the inventories of principal export prod- 
ucts, livestock holdings and other readily measurable quantities, and in such 
cases more confidence can be placed in the limited results. It would definitely 
not appear that the possibilities of making realistic estimates of the annual 
changes in inventories of capital goods held by producers or in trade channels 
are very good, and extraordinary efforts in this direction would not be com- 
mensurate with the small importance of the adjustment in underdeveloped 
countries. If under special circumstances it is felt necessary to isolate stock 
changes—e.g., where it is known that import controls, heavy duties, etc. are to 
be imposed and in anticipation imports are running at abnormally high levels— 
a rough adjustment may be based on correspondence with trade associations,
special surveys, and possibly trade census data. Still another adjustment that is
required is for exports of capital goods. While in the least developed countries 
such exports will be negligible and can be disregarded, where this is not so the 
foreign trade statistics will lead to the information needed. 
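
Taken together, the steps discussed in this section amount to a simple accounting identity, sketched below. The field names and the numbers are hypothetical, the margins are shown as single fractions only for brevity, and the valuation of exports is simplified.

```python
from dataclasses import dataclass


@dataclass
class EquipmentFlows:
    """Hypothetical inputs to a production-method estimate of equipment."""
    imports_cif: float            # capital goods imports, c.i.f.
    import_margin: float          # distribution and other costs, as a fraction
    domestic_output: float        # home-produced equipment at factory prices
    domestic_margin: float        # margin on home output, as a fraction
    exports: float = 0.0          # capital goods exported
    stock_increase: float = 0.0   # net additions to stocks of equipment


def equipment_investment(flows: EquipmentFlows) -> float:
    """Fixed investment in equipment at final cost to purchasers."""
    supply = (flows.imports_cif * (1.0 + flows.import_margin)
              + flows.domestic_output * (1.0 + flows.domestic_margin))
    # Exports and net additions to stocks reduce fixed investment.
    return supply - flows.exports - flows.stock_increase


flows = EquipmentFlows(imports_cif=900.0, import_margin=0.30,
                       domestic_output=100.0, domestic_margin=0.20,
                       exports=10.0, stock_increase=0.0)
total = equipment_investment(flows)
```

In this invented case imports dominate the total, so the import margin is the assumption that matters most.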


3. THE INVESTMENT IN CONSTRUCTION


So much, then, for the problems and snares of estimating the investment in 
machinery and equipment. Unfortunately, the difficulties of obtaining a sat- 
isfactory measure of the investment in buildings and other construction and 
works—the larger part usually of total fixed capital formation—are at least 
equally great. Very few underdeveloped countries take censuses of the con- 
struction industry and a broad, consistent body of data on construction activity 
is therefore lacking. (It goes without saying that countries which do not use 
the expenditure method to measure equipment will not be in a position to de- 
duce construction in this fashion either.) In these circumstances, what are the 
usual source materials for the estimates? The materials seem to be of two 
kinds, leading to two distinct methods: (1) building permits, licenses or similar 
records, supplemented by the public accounts for public building, and (2) 
statistics of building materials used in construction work. 

The use of building permits is fairly common. Records of this type usually 
specify the cost or value of the construction to be undertaken. Where cost is 
not given it can be estimated by multiplying the number of square feet, cubic 
yards, etc. of construction shown in the permits by an appropriate average 
cost per such unit of construction, possibly with allowances for different unit 
costs in different regions, for different kinds of structures, etc. Where informa-
tion from permits or other local records is collected on a sample basis by ques-
tionnaires addressed to local officials, a suitable basis for “grossing up” the sam-
ple results must be found. One basis is to determine the value of construction
per capita for cities, towns, villages, etc. of different population size and to 
use these figures in establishing a national total. 
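
The per capita basis for grossing up can be sketched as follows. The size classes, sample figures, and population totals are hypothetical, and the function name is an assumption.

```python
def gross_up_by_size_class(sample, universe_population):
    """Raise sample permit values to a national total using per capita values.

    sample: size class -> (permit value in sampled places, their population)
    universe_population: size class -> population of all places in the class
    """
    total = 0.0
    for size_class, (value, population) in sample.items():
        per_capita = value / population  # construction per head in the sample
        total += per_capita * universe_population[size_class]
    return total


# Hypothetical sample results from questionnaires to local officials.
sample = {
    "cities": (5_000_000.0, 100_000),    # 50 per head
    "towns": (1_000_000.0, 50_000),      # 20 per head
    "villages": (200_000.0, 40_000),     # 5 per head
}
universe_population = {"cities": 400_000, "towns": 300_000, "villages": 1_000_000}

national_total = gross_up_by_size_class(sample, universe_population)
```

Keeping the size classes separate matters here, since a single over-all per capita figure would give the heavily sampled cities too much weight.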

Although most countries give little or no indication of the actual techniques 
used to arrive at a total from building permits data, there are reasons for be-
lieving the estimates to be fairly crude, if only because of some inherent short-
comings in the approach. In the first place, it is a fact that in some countries
permits or licenses are required only in cities, or for certain types of construc- 
tion, or for work exceeding a stated value; in such cases permit valuations must 
furnish a most precarious basis for estimating construction. Secondly, actual 
construction put in place need not correspond to estimates shown on permits 
since unlicensed building usually takes place, permits lapse without building 
being started, rates of construction vary and so on. Due allowance for these 
factors is not usually made. It would seem that occasional surveys to relate the 
actual value of construction in successive months to the value of permits issued 
would be one satisfactory means of assessing and improving present estimates. 

For public construction the public accounts, published annual statements of 
public corporations, etc., are used. Information for local governmental units is, 
however, all too often inadequate. A common technique is therefore to raise a 
sample of local government construction expenditures to the universe level by 
the use of an appropriate factor, such as population. Even so, the fact that the 
available figures are usually in need of detailed reclassification and adjustment 
if they are to conform to accepted definitions of investment suggests that un- 
less corrections are made the results will suffer from the idiosyncrasies of local 
budgetary practices.

The second method mentioned, that of basing construction estimates on
building materials used up, has obvious attractions for underdeveloped 
countries with reasonably good data on building materials and supplies. The 
most careful estimates made along these lines start with the value of building 
materials produced and imported and add other cost elements, such as trans-
portation charges and mark-ups for materials, wages in the building trades
and profits of builders. More often, however, the estimating procedure consists 
of multiplying the value of a small number of imports of basic building mate-
rials by some factor to yield an approximate measure of the sum total of all
building. In one African country the construction estimate is based on cement 
imports alone. The fallacy of building up estimates from too narrow a base 
should be clear from the mere fact that especially in the less developed countries 
many forms of construction require hardly any materials at all or only simple
native materials (consider land reclamation and road building). Where actual 
studies indicate a fairly persistent relationship between the value of basic 
building materials and the full cost of construction, the method is a convenient 
one to use, but the invitation offered to dangerous oversimplification has to 
be kept in mind. 
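
The two variants of the materials method can be contrasted in a short sketch. All cost elements and the cement figures are hypothetical; the point is only that the narrow-base variant depends entirely on the stability of the assumed factor.

```python
def construction_from_cost_elements(materials, transport_and_markups, wages, profits):
    """Fuller variant: building materials plus the other cost elements."""
    return materials + transport_and_markups + wages + profits


def construction_from_multiplier(basic_material_imports, factor):
    """Narrow-base variant: a few basic imports (e.g. cement) times a fixed factor."""
    return basic_material_imports * factor


# Hypothetical figures for a base year.
full_estimate = construction_from_cost_elements(
    materials=500.0, transport_and_markups=100.0, wages=300.0, profits=100.0)

# Factor implied by the base year if cement imports were 50.
implied_factor = full_estimate / 50.0

# The same factor applied to a later year in which cement imports rise to 60.
narrow_estimate = construction_from_multiplier(60.0, implied_factor)
```

Here the narrow estimate rises by a fifth purely because cement imports did, whether or not road building, land reclamation, and other work using little cement changed at all.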

Irrespective of the general approach to measuring investment in construc- 
tion, it is extremely unlikely that the bulk of own-account construction by 
farmers will be covered without special efforts. Since improvements to their 
farms made by farmers may account for a sizeable fraction of the gross invest- 
ment in the agricultural sectors of the less industrialized countries, the omis- 
sion of such investment may well limit the usefulness of capital formation 
statistics for many purposes, especially where policies are being pursued to 
develop agriculture, where the agricultural population is growing rapidly, etc. 
Direct improvements to farms may take many forms, including the construc- 
tion of barns, fences, roads, the digging of wells and irrigation ditches, the de- 
velopment of orchards and plantations, and the reclamation, drainage, and ter- 
racing of lands. Not many countries make adequate allowance for these capital 
improvements, probably because of their preoccupation with investment in 
heavy industry and because of the statistical difficulties involved. 

These difficulties are of two kinds. In the first place, the extent of such 
construction activity must be determined. Secondly, a suitable basis for valuing 
the construction must be found. As regards the first problem, it is difficult to 
see how farm improvements can be known unless agricultural censuses, or even 
better, surveys are utilized for this purpose. In one or two countries where a 
national farm survey is taken annually, the accounts kept on each of the sam- 
ple farms include a record of the materials purchased for capital purposes and 
the cost of labor engaged in capital work. In the majority of cases where some 
kind of estimate of own-account farm construction is made, however, informa- 
tion of this kind is altogether lacking and only indirect evidence is used. 

Aside from the statistical requirements for a useful estimate, the question of a 
basis for valuation of these assets arises. It is possible, for example, to use the 
cost of materials and paid labor, or of materials and all labor (i.e., including im- 
puted wages for work done by members of the family); a different method 
would be to use the cost of similar construction work done on a commercial
basis. While the first method, valuation at actual cost, is the simplest and
avoids imputations, it is clear that applying the method to projects carried out
with unpaid labor and simple materials will lead to small or even zero values.
For many purposes it would seem to be preferable, therefore, to value the un- 
paid labor at local wage rates. In any case, where construction work done on this 
basis is believed to be important, it would be wise to show such construction as 
a separate item in the statistics of capital formation and to explain how the 
values set down have been reached. 
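
The difference between the two cost bases can be illustrated with a hypothetical project. The function names, the number of days worked, and the wage rate below are invented for the purpose.

```python
def value_at_actual_cost(materials, paid_wages):
    """Valuation at actual outlay: purchased materials plus wages actually paid."""
    return materials + paid_wages


def value_with_imputed_labor(materials, paid_wages, unpaid_days, local_daily_wage):
    """Valuation that also prices unpaid family labor at the local wage rate."""
    return value_at_actual_cost(materials, paid_wages) + unpaid_days * local_daily_wage


# Hypothetical irrigation ditch dug entirely by the farm family with local materials.
actual = value_at_actual_cost(materials=0.0, paid_wages=0.0)
imputed = value_with_imputed_labor(materials=0.0, paid_wages=0.0,
                                   unpaid_days=60, local_daily_wage=2.0)
```

Valued at actual cost the ditch contributes nothing to capital formation, while the imputation assigns it the value of sixty days of labor at the local rate.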


4. CONCLUSIONS 


In the limits of a brief paper it is not possible, of course, to go deeply into 
the methods of estimating a complex aggregate such as capital formation, but 
even the discussion above will suggest to the reader the many steps involved 
and the difficulties inherent in carrying them through successfully. In addi- 
tion to the statistical inaccuracies, it should be remembered that there exist 
also certain differences in concepts.⁵ A few of the conceptual issues have been
mentioned but there are others, e.g., the treatment of public expenditures for 
durable military equipment and installations. (Are such expenditures to be 
treated as capital or current?) Furthermore, the whole of the discussion has 
referred to the estimation of gross fixed capital formation. What of changes in 
inventories which must be taken into account in measuring total domestic in- 
vestment? Except for changes in livestock holdings and in stocks of major 
export products the statistical sources in underdeveloped countries are mostly 
inadequate for measuring stock changes. And what of depreciation for which 
figures are needed if net investment is to be shown? Aside from the conceptual 
problems (e.g., is depreciation to be measured by allocating original costs over 
the average lifetime of assets or by allocating replacement costs?), the statis- 
tical deficiencies in corporate tax statistics, the lack of uniformity in business 
accounting procedures, etc., suggest that here again only very tentative esti- 
mates may be possible. 
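
The conceptual choice between original and replacement cost can be shown in a few lines. The sketch assumes straight-line allocation over the asset's life, which is an assumption of the illustration rather than a method prescribed here; the cost, lifetime, and price index values are hypothetical.

```python
def depreciation_original_cost(original_cost, lifetime_years):
    """Annual charge when original cost is spread evenly over the asset's life."""
    return original_cost / lifetime_years


def depreciation_replacement_cost(original_cost, lifetime_years,
                                  price_index_at_purchase, price_index_now):
    """Annual charge when the asset is first revalued at current prices."""
    replacement_cost = original_cost * price_index_now / price_index_at_purchase
    return replacement_cost / lifetime_years


# Hypothetical asset: cost 1000, ten-year life, capital goods prices up by half.
historic = depreciation_original_cost(1000.0, 10)
current = depreciation_replacement_cost(1000.0, 10,
                                        price_index_at_purchase=100.0,
                                        price_index_now=150.0)
```

With prices up by half, the replacement-cost charge is half again as large as the original-cost charge, and net investment is correspondingly smaller.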

Another matter that has not been mentioned is the composition of gross fixed 
capital formation. For many economic studies it is not so much the scale of 
capital development but its character that is important. Whether the question 
is one having to do with differences in capital productivity, the channeling of 
resources into domestic investment, structural changes, etc., the need of de- 
tailed figures is obvious. Fortunately, the general approach described lends it- 
self to a classification of capital by type of good, particularly as regards pro- 
ducers’ equipment. In building up a picture of the flow of capital to the various 
economic sectors a loss of precision is involved. While tractors, mining equip- 
ment, and many other capital goods can readily be allocated to the industry 
where they will be used, a large number of less specialized types cannot be 
unambiguously distributed. To the extent that allocation can be carried out 
successfully, a useful cross-classification by purchasing industry and type of
good will result. Whether government investment can be distinguished from
private investment will depend in turn mainly on the coverage and char-
acter of the public accounts, central and local. In general, the most reliable
breakdown will therefore be that of equipment by type; the distributions by
industrial and institutional sectors, while attainable, will be cruder.


5 Cf. Phyllis Deane, Colonial Social Accounting, Cambridge, 1953, p. 4. Writing of national income problems
in African countries, the author says: “... it is not the margin of error arising from inadequate statistical data
that hinders most the application of national income estimates to practical policy purposes, it is the fog that sur-
rounds the concepts themselves.” Admittedly the conceptual problems are more serious in African countries than
anywhere else.

Quite apart from the particular deficiencies brought out or suggested in this 
paper, it would seem that for many countries the evidences of internal incon- 
sistencies in the investment statistics, the magnitude of the successive revisions 
which the figures undergo, and the conflicting estimates in circulation are in 
themselves effective arguments against the uncritical use of the figures. In time, 
of course, the estimates will improve. Until that time arrives, however, the 
economist interested in reasoning realistically about the problems of the un- 
derdeveloped countries would do well to keep in mind the limits of the data 
with which he has to work. Whether by the time the data are all that could be 
wished the underdeveloped countries will not already have joined the ranks 
of the developed is not a question which the author has set himself to answer. 


MANUFACTURERS’ INVENTORY CYCLES AND 
MONETARY POLICY* 


Doris M. EISEMANN
The RAND Corporation 


The availability of credit is one of the factors which many business- 
men must take into account when planning their inventory policy. 
When inventories are rising rapidly, firms become increasingly depend- 
ent on bank credit, and a change in credit policy may have an important 
influence on inventory fluctuations. This study attempts to measure 
the impact monetary policy might have on inventories, and to examine 
the limitations such a policy might face. 


THE accumulation and liquidation of businessmen’s inventories has long been
recognized as one of the unsettling forces of our economy. Since business-
men often finance their inventories from bank credit, the question arises as to
the influence which the availability of credit has on inventory cycles, and also 
the possibility of mitigating these fluctuations through credit control. 

This study attempts to point out both the possibilities and the limitations of 
using monetary tools to control inventory cycles. Fluctuations in interest rates 
are not viewed as necessarily initiating changes in businessmen’s inventory 
policies, but rather the availability or cost of credit is considered as a limiting 
factor for these policies. Because of the heterogeneous character of manufac- 
turing firms, the data were analyzed by industry group and by size group. The 
scope is limited to the period since 1947. 


THE ROLE OF BANK CREDIT IN FINANCING INVENTORIES 


The proportion of inventories financed by bank borrowing cannot be measured 
with precision, since firms do not generally separate credit needs among their 
various short-term assets. A further complication is that the ratio of loans to in- 
ventories does not tend to be stable, but rises and falls with the inventory level. 

The average ratio of short-term bank loans to inventories for manufacturing 
was 12 per cent for the period 1947-1955.¹ Therefore, on average, somewhat
less than 12 per cent of inventories was financed from bank borrowings. How- 
ever, between the first quarter of 1950 and the first quarter of 1952, a period 
of rapid inventory accumulation, the ratio of additional loans to the increase 
in inventories was a little over 25 per cent.² The ratio rose from 18 per cent for
the first half of 1950 to 22 per cent during the second half, and to 32 per cent for
the first quarter in 1951, which was the peak rate of inventory accumulation.
After this the ratio began to decline. During this same period the liquidity ratio
of manufacturing firms was falling. This emphasizes the increasing importance
of bank credit as a means of financing inventories during the upswing of the
inventory cycle, especially near the peak.


* This study is based on a paper presented at The American Statistical Association Annual Meeting session
on Business Inventories, September 10, 1957, in Atlantic City, New Jersey. Most of the basic research was done
while I was a member of the Economic Policy Commission of the American Bankers Association, where I benefited
from the critical insight of E. Sherman Adams and Murray G. Lee. Thanks are also due to B. H. Beckhart in whose
Banking Seminar at Columbia University many ideas were clarified, and to Ruth P. Mack whose encouragement
and suggestions were of great value. Finally, I wish to thank the Cost Analysis Department of the Rand Corporation
for providing the time necessary for completing this paper.

1 The data are from the Federal Trade Commission-Securities and Exchange Commission Quarterly Financial
Report for Manufacturing Corporations.

2 For lack of a better measure, we assumed that the increase in loans accompanying an increase in inventories
is a rough approximation of the additional inventories financed by bank credit. Even if some of the incremental
borrowing was caused by factors other than inventories, if inventories had not risen, funds would have been avail-
able for these other needs.

During the following period of inventory liquidation the pattern was re- 
versed. Loans were curtailed more sharply than stocks, and the ratio of loans 
to inventories shrank. For the period as a whole, percentage loan fluctuations 
were about three times larger than inventory changes for total manufacturing, 
ranging from two and a half to seven times the percentage inventory changes 
for the individual industries. Because of this tendency for businesses to borrow 
proportionately more during the upswing of an inventory cycle, and to cut their 
borrowings more sharply when inventories are shrinking, the over-all ratio of 
loans to inventories tends to understate the potential impact of a credit re- 
striction. 

INDUSTRY VARIATIONS 


From the standpoint of stabilizing inventory cycles, it would be desirable 
that the industries most sensitive to cyclical inventory fluctuations have a high 
ratio of short-term loans to inventories, and a close correlation between changes 
in loans and in inventories. Actually, industries with large inventory cycles 
were not distinguished from the other industries in either respect.³ A tightening 
of credit would therefore impede the inventory build-up in those industries 
which do not tend to be cyclically sensitive just as much as in the other in- 
dustries. This is apparent from Table 682 which lists the average amplitudes of 
inventory changes, the ratios of loans to inventories, and the correlations 
between percentage changes in loans and in inventories for the twenty-two 
manufacturing industries. When the industries were grouped into durables and 
nondurables, the durables were found to be more subject to cyclical fluctuations, 
but to borrow relatively less, than the nondurables. 

There was considerable variation in the general inventory level of the differ- 
ent manufacturing industries. The industries with high inventory levels in 
relation to their assets generally borrowed more relative to assets than the 
other industries. The coefficient of correlation was .895. Furthermore, during the 
period of rapid inventory investment, between 1950 and 1952, industries which 
increased their inventories more rapidly, percentagewise, also borrowed more 
than the other industries (Table 683). Machinery, nonautomotive transporta- 
tion equipment, and instruments, all of which were strongly affected by the 
Korean War, financed almost one-third of their additional inventories by increasing 
their bank debt. Although these industries held less than one-tenth 
of the loans outstanding at the beginning of the period, they accounted for two- 
fifths of the increase in loans, and for more than a third of the inventory in- 
vestment. 


3 The correlation for both of these was close to zero. There was, incidentally, a tendency for industries which 
had a high loan-inventory ratio also to have a closer relationship between changes in loans and in inventories. 
The coefficient of rank correlation was .56. 

4 The correlation between percentage changes in inventories and percentage changes in loans for the twenty-two 
manufacturing industries was .636. 


TABLE 682 


MANUFACTURERS' INVENTORIES AND SHORT-TERM BANK LOANS 
22 MANUFACTURING INDUSTRIES, 1947-1955 


                                             Average Amplitude of Quarterly Percentage    Ratio of   Correlation Between Percentage
                                                             Changes*                     Loans to   Changes in Loans and Inventories
                                                 Inventories             Loans         Inventories   (Unadjusted / Seasonally Adjusted)
Industry                                      Cycl.  Seas.  Total  Cycl.  Seas.  Total  (per cent)

All Durable Goods                              1.08    .90   1.18   2.87   2.86   4.05         9.5   .65 / .82
Other transportation equipment                 1.93   1.36   2.40   5.77   3.80   8.22        15.3   .47
Motor vehicles & equipment                     1.67   1.73   2.06   3.51   4.96   5.48         7.5   .68 / .57
Fabricated metal products                      1.36   2.00   1.96   3.09  11.16  10.49        12.0   .72 / .78
Electrical machinery, equipment & supplies     1.33   1.22   1.50   3.74   3.66   5.25         7.7   .63 / .71
Furniture & fixtures                           1.31    .85   1.48   2.40   2.85   3.95        12.5   .40
Instruments & related products                 1.27   1.36   1.50   5.61   4.23   8.01        11.7   .44 / .47
Lumber & wood products                         1.22   1.83   1.91   3.46   3.05   5.02        13.2   .48 / .57
Primary nonferrous metal                       1.13   1.20   1.64   3.65   3.77   5.02         5.8   .42 / .41
Machinery (except electrical)                  1.11    .86   1.23   4.48   4.09   5.83        10.0   .77 / .88
Primary iron & steel                           1.09   2.22   1.96   5.85   4.68   7.76         3.8   .23 / .26
Miscellaneous (including ordnance)             1.08    .71   1.45   2.28   5.03   5.40        15.1   .53 / .76
Stone, clay, & glass                              …   1.66   1.56   3.73   4.51   5.17         6.5   .43

All Nondurable Goods                            .68    .83   1.00   1.80   2.42   4.36        14.5   .70 / .84
Apparel & related products                     1.62   2.39   2.54   3.16   7.27   7.70        23.8   .62 / .89
Textile mill products                          1.31   1.33   1.74   3.53   3.63   4.92        15.2   .64 / .88
Paper & allied products                        1.16   1.02   1.29   2.37   4.82   5.29         7.9   .15
Rubber products                                1.07   1.74   1.82   4.67   7.99   8.79         5.6   .50 / .21
Leather & leather products                     1.04   1.21   1.51   3.85  10.15  11.12        19.2   .09 / .29
Petroleum & petroleum products                  .98   1.39   1.48   4.27   4.59   6.27         3.9   .35 / .44
Chemicals & allied products                     .86   2.10   2.09   2.81  13.90  14.11         9.7   .91 / .63
Printing & publishing                           .76   1.38   1.48   1.67   6.64   6.93        16.7   .59 / .58
Food & kindred products                         .69   2.20   2.08   1.90   8.73   8.48        22.3   .93 / .82
Tobacco manufacturers                           .58   1.72   1.70   3.16   9.82  10.32        13.8   .93 / .83

All Manufacturing                               .84    .66      …   1.86   2.20   2.79        11.9   .67


Source: Federal Trade Commission—Securities and Exchange Commission, Quarterly Financial Report for 
Manufacturing Corporations, 1947-1955. 

* The average amplitude is the average, without regard to sign, of the percentage changes. The seasonal component 
is the seasonal adjustment factor; the cyclical component is a weighted 15-month moving average of the seasonally 
adjusted series. These are monthly changes based on quarterly data. 
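
As a rough illustration of the definition in this note, the average amplitude can be computed as the mean of the absolute period-to-period percentage changes. The sketch below is in Python; the function name and the sample series are hypothetical and are not taken from the Quarterly Financial Report data.

```python
def average_amplitude(levels):
    """Mean of the percentage changes between successive observations,
    taken without regard to sign."""
    if len(levels) < 2:
        raise ValueError("need at least two observations")
    changes = [
        100.0 * (curr - prev) / prev
        for prev, curr in zip(levels, levels[1:])
    ]
    return sum(abs(change) for change in changes) / len(changes)


# Hypothetical inventory levels (millions of dollars), for illustration only:
# the series rises 2 per cent, falls 2 per cent, then rises 2 per cent.
sample_levels = [100.0, 102.0, 99.96, 101.9592]
amplitude = average_amplitude(sample_levels)
```

Because the sign of each change is dropped, a series that rises and falls by 2 per cent in alternate periods has an average amplitude of about 2, even though its level hardly moves over the whole span.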


From the previous discussion, it might be expected that the industries which 
expanded their inventories more rapidly between 1950 and 1952 would also 
have a higher ratio of additional loans to increased inventories than the other 
industries.⁵ However, this was not the case, perhaps because of the variation 
in the average ratio of loans to inventories among different industries. Thus 
food, which has a high loan to inventory ratio, also had a relatively large ratio 


5 This, of course, refers to the tendency we found for manufacturers to increase the proportion of loans to 
inventories as the inventory accumulation gained momentum. Further evidence of the failure of individual industries 
to reflect this tendency is the fact that there was no correlation between the size of cyclical inventory and loan 
fluctuations. 


TABLE 683 


ABSOLUTE AND RELATIVE INCREASE IN LOANS AND INVENTORIES BETWEEN 
THE FIRST QUARTER OF 1950, AND THE FIRST QUARTER 
OF 1952, 22 MANUFACTURING INDUSTRIES 


                                      Absolute Increase    Relative Increase        Absolute Increase
                                    (millions of dollars)      (per cent)            in Loans as % of
                                                                                    Absolute Increase
Industry                               Loans  Inventories    Loans  Inventories        in Inventories

Tobacco                                  208          270       79           15                  77.0
Food                                     740        1,474      103           37                  50.2
Machinery (except electrical)            819        2,096      539           68                  39.1
Other transportation equipment           500        1,303    1,064          181                  38.4
Instruments and related products         133          388      578           77                  34.3
Textile mill products                    271          934      108           46                  29.0
Fabricated metal products                241          867      236           57                  27.8
Chemicals and allied products            283        1,081      156           52                  26.2
Apparel and related products             108          432       52           46                  25.0
Leather and leather products              30          123       31           26                  24.4
Miscellaneous (incl. ordnance)            71          303      145           49                  23.4
Motor vehicles and equipment             211        1,036      188           53                  20.4
Lumber and wood products                  43          267       63           50                  16.1
Primary iron and steel                    87          609      272           40                  14.3
Printing and publishing                   25          183       42           43                  13.7
Electrical machinery, equipment,
  and supplies                           200        1,459      444           93                  13.7
Rubber products                           35          428      992           64                   8.2
Furniture and fixtures                    15          192       24           69                   7.8
Stone, clay, and glass                    17          297       45           48                   5.7
Petroleum refining and products
  of petroleum and coal                    1          460        1           23                   0.2
Primary nonferrous metal                   0          197        0           23                     0
Paper and allied products                 -7          378      -11           53                  -1.9


Source: Federal Trade Commission—Securities and Exchange Commission, Quarterly Financial Report for 
Manufacturing Corporations. 


of additional loans to additional inventories, even though it only had a mod- 
erate increase in inventories. Electrical machinery, on the other hand, which 
generally borrows little in relation to its inventories, increased its inventory 
investments substantially with relatively little additional borrowing. 
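
The marginal ratio discussed here is the last column of Table 683: the absolute increase in loans expressed as a percentage of the absolute increase in inventories. The Python sketch below recomputes it for a few rows, using the dollar figures as read from the table. The names `TABLE_683_ROWS` and `marginal_loan_ratio` are illustrative only.

```python
# (industry, increase in loans, increase in inventories), millions of dollars,
# first quarter 1950 to first quarter 1952, as read from Table 683.
TABLE_683_ROWS = [
    ("Tobacco", 208, 270),
    ("Food", 740, 1474),
    ("Machinery (except electrical)", 819, 2096),
    ("Electrical machinery, equipment, and supplies", 200, 1459),
    ("Paper and allied products", -7, 378),
]


def marginal_loan_ratio(loan_increase, inventory_increase):
    """Increase in loans as a percentage of the increase in inventories."""
    return 100.0 * loan_increase / inventory_increase


ratios = {
    industry: round(marginal_loan_ratio(loans, inventories), 1)
    for industry, loans, inventories in TABLE_683_ROWS
}

for industry, ratio in ratios.items():
    print(f"{industry:<48}{ratio:>6.1f}")
```

On these figures food, with a moderate inventory increase, shows a ratio of about 50 per cent, while electrical machinery, with a much larger inventory increase, shows about 14 per cent, which is the contrast drawn in the text.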

In summary, the ratio of loans to inventories was neither smaller nor larger 
in the industries marked by large inventory cycles. Further, the marginal ratio 
of loans to inventories during the 1950-1952 expansion was independent of the 
rate of inventory expansion. The industries with high levels of inventories and 
those with a rapid inventory expansion did borrow more than the other in- 
dustries, but this was related to the fact that they purchased more stocks, 
rather than that they financed a larger proportion of these stocks from bank 
credit. 


VARIATIONS BY SIZE OF FIRM⁶ 


Under certain conditions, monetary authorities may be chiefly interested in 
lowering inventory levels of firms of a particular size group. This, for instance, 
was the case in August 1957, when money was held tight in an attempt to 
break administered prices. The theory was that the inflation was caused by 
firms keeping their stocks off the market in the hope of obtaining higher prices 
at a future date. Restraints on bank lending would reduce the ability of sellers 
to hold on to these inventories.⁷ Since administered prices are chiefly a phenomenon 
of large firms, the success of this policy depended on the responsiveness 
of large firms to a credit squeeze. 


TABLE 684 


RELATIONSHIP OF LOANS TO INVENTORIES BY SIZE OF FIRM, 
1951-1956 


Asset Size Class          Ratio of Loans     Correlation between Quarterly
(millions of dollars)     to Inventories     Changes in Loans and Inventories


                                                          .894
                                                          .833
                                                          .925
                                                          .916
                                                          .787
                                                          .508
                                                          .350


100 and over 


Source: SEC-FTC Quarterly Financial Reports for Manufacturing Corporations. 


Table 684 indicates that the inventory level of large firms is not very sensitive 
to changes in monetary policy.⁸ The ratio of short-term loans to inventories 
for these firms was relatively low, and the correlation between changes in 
loans and inventories was insignificant. In addition, large businesses are less 
likely to feel a credit squeeze than smaller firms (this will be discussed in more 
detail later). It therefore seems unlikely that administered prices could be 
effectively dealt with through a credit restriction. 

Both the ratio of loans to inventories and the degree of relationship between 
changes in loans and inventories were largest for the medium sized corporations, 
those with assets between $1 and $5 million. Smaller firms had a slightly 
lower ratio and coefficient of correlation, and firms with more than $50 million 
seemed least dependent on bank loans for their inventory investments. As 
medium sized firms also carried more inventories in relation to their assets 
than the other firms, a credit restraint would tend to hamper inventory invest- 
ments more in this group. 


6 Since the size classification was not available by individual manufacturing industries, but only for total 
manufacturing, the variations found among the different size groups may be a reflection of industry characteristics 
rather than size characteristics. 

7 See testimony of William McChesney Martin in Hearings Before the Committee on Banking and Currency, 
House of Representatives, on S-1451 and HR-7026, Pt. I, pp. 420-22. 

8 The asset size breakdown was not available until 1951. This section is therefore based on data from the first 
quarter of 1951 to the second quarter of 1956. 


SEASONAL FLUCTUATIONS 


Whatever the initial action undertaken to restrict credit, it is quite certain 
to result in a rising interest rate. This rise in the interest rate will be as much 
of a burden to firms building up inventories to meet the seasonal peak in their 
business, as it will be to those purchasing inventories for speculative or other 
reasons. While bankers will generally try to accommodate the seasonal needs of 
their regular customers, even when credit is tight, the higher price these cus- 
tomers will have to pay for this credit may present a real hardship to them. 

For most industries, seasonal inventory fluctuations were larger than cy- 
clical inventory swings. Table 682 listed the average size of seasonal inventory 
and loan fluctuations for the 22 manufacturing industries during the 1947- 
1955 period. Taking a simple average of all the industries, the amplitudes of 
the seasonal swings were 1.3 times larger than the cyclical ones for inventories, 
and 1.7 times larger for loans.⁹ 
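
The same seasonal-to-cyclical comparison can be checked for total manufacturing, for which the accompanying footnote reports ratios of 0.8 for inventories and 1.2 for loans. The Python sketch below uses the "All Manufacturing" amplitudes as read from Table 682 (.84 and .66 for inventories, 1.86 and 2.20 for loans). The dictionary layout and function name are illustrative assumptions.

```python
# Average amplitudes for "All Manufacturing", as read from Table 682.
ALL_MANUFACTURING = {
    "inventories": {"cyclical": 0.84, "seasonal": 0.66},
    "loans": {"cyclical": 1.86, "seasonal": 2.20},
}


def seasonal_to_cyclical(amplitudes):
    """Ratio of the seasonal amplitude to the cyclical amplitude."""
    return amplitudes["seasonal"] / amplitudes["cyclical"]


ratios = {
    series: round(seasonal_to_cyclical(amplitudes), 1)
    for series, amplitudes in ALL_MANUFACTURING.items()
}
print(ratios)
```

A ratio below 1 means the seasonal swings were smaller than the cyclical ones, as for aggregate inventories, and a ratio above 1 means they were larger, as for aggregate loans.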

Approximately half of the seasonal inventory expansions were financed by 
short-term borrowings in the chemicals, food, tobacco, and metals industries.¹⁰ 
The apparel industry, however, which had the largest seasonal inventory fluc- 
tuations, financed less than one-third of these changes through bank loans. 
The prevalence of factor financing in this industry may account for this. 
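
The share of a seasonal inventory expansion financed by bank loans is measured here as the ratio of additional loans to additional inventories from the seasonal low to the seasonal peak. A minimal Python sketch of that calculation follows. The quarterly figures are hypothetical, and locating the low and the peak on the inventory series is an assumption of the sketch, not something the source specifies.

```python
def seasonal_financing_share(inventories, loans):
    """Ratio of additional loans to additional inventories between the
    seasonal low and the seasonal peak of the inventory series.

    Both lists hold levels for the same consecutive periods. The low and
    the peak are located on the inventory series (an assumption).
    """
    if len(inventories) != len(loans):
        raise ValueError("series must cover the same periods")
    periods = range(len(inventories))
    low = min(periods, key=lambda i: inventories[i])
    peak = max(periods, key=lambda i: inventories[i])
    added_inventories = inventories[peak] - inventories[low]
    if added_inventories == 0:
        raise ValueError("inventories show no seasonal swing")
    return (loans[peak] - loans[low]) / added_inventories


# Hypothetical quarterly levels (millions of dollars), for illustration only.
inventories = [400, 460, 520, 430]
loans = [90, 118, 150, 100]
share = seasonal_financing_share(inventories, loans)
```

With these made-up figures inventories rise by 120 and loans by 60 between the low and the peak, so half of the seasonal expansion would be counted as financed by short-term borrowing.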

Industries with substantial seasonal fluctuations were neither more nor less 
subject to cyclical influences than the other industries (Table 682). The indus- 
tries which would inadvertently suffer most from a credit restriction resulting 
in higher credit costs are those with large seasonal inventory fluctuations, but 
relatively small cyclical swings, such as food and chemicals. 


THE CREDIT SIDE 


It is generally believed that inventories are “the current asset most closely 
linked to the use of short term credit.”¹¹ However, inventories are by no means 
the only use for business borrowing. Short-term funds are also used for carrying 
accounts receivable, meeting payroll expenditures, paying taxes, and purchas- 
ing equipment. Both the relative importance and the relative elasticity of these 
other assets in respect to credit will determine the extent to which monetary 
policy can influence inventories. 

The belief that inventory borrowing represents a large slice of business borrowing 
was substantiated by our findings that the industries with high inventory 
levels relative to assets generally borrowed more than the other industries. 
However, we have no knowledge as to just how large this slice is. 


9 For total manufacturing, however, the seasonal swings were smaller than the cyclical ones for inventories 
(the ratio was 0.8) and only slightly larger for loans (1.2). This reduction in the importance of seasonal swings 
when the various manufacturing industries are combined is explained by the variation in timing of the seasonal 
peaks and troughs of the different industries. When the industries were aggregated, the seasonal fluctuations cancelled 
out to some extent. This was more true of inventories than loans, since certain needs for short-term credit, 
notably taxes, affect all industries at the same time. This, incidentally, explains the considerable difference between 
the correlations of the adjusted and the unadjusted data for total manufacturing. The latter was lower than the 
former, because factors other than inventories influence seasonal loan fluctuations more for total manufacturing. 

10 To measure the proportion of inventories financed by bank loans we used the ratio of additional loans to 
additional inventories from the seasonal low to the seasonal peak. 

11 Jacoby and Saulnier, Business Finance and Banking, National Bureau of Economic Research, Inc., New York, 
1947, p. 86. 

The relative elasticity of the various short-term assets is even harder to 
measure. It is well known that taxes enjoy with death the distinction of being 
completely inelastic. Meeting payroll expenditures is probably also less elastic 
than inventory purchases. The elasticity of the other assets, however, probably 
changes with circumstances. Until more is known as to which assets will be 
reduced when credit is cut, the impact of a change in credit conditions on 
inventories cannot be measured with precision. 

[Chart 686. Ratio of loans to assets, by asset size class in millions of dollars, 1951-1956. Shaded areas represent periods of credit restraint. Source: S.E.C.-F.T.C. Quarterly Financial Reports for Manufacturing Corporations.] 

Another element of uncertainty is the extent to which a credit squeeze is 
actually effective in restricting funds. In the short run, some firms will be 
able to find alternate sources of funds if they are refused a bank loan. This 
may take the form of increased trade credit, factor financing, etc. In the 
long run, however, these other financing sources will also feel the impact of 
tighter money conditions, and will be unable to satisfy the increased demand. 

The ability to obtain funds when credit is tight varies among firms. Discussion 
lately has centered around the difficulty small business faces in securing 
bank credit. This difficulty is not so much a matter of the size of the firm 
as it is of the credit risk involved. Small firms generally have a higher 
mortality rate and are younger than larger firms. While the delinquency 
rate of small business may not be very large, this could be due to the fact that 
they are refused credit more often than the larger firms.¹² Looking at the ratio 
of bank loans to assets since 1951 (Chart 686), the borrowings of small 
firms shrank during periods of credit restraint, while large firms were still 
able to expand their borrowings. The responsiveness to monetary policy seems 
to have had a direct relationship to the size of the borrower. 


12 In a survey of external financing of small and medium sized businesses, the U. S. Department of Commerce 
found that the ratio of firms obtaining all funds requested varied directly with size, ranging from 49 per cent for the 
smallest businesses to 68 per cent for the largest size group. The ratio of firms which failed to obtain any of the funds 
they needed from banks varied inversely with size, ranging from 20 per cent to 5 per cent. McHugh and Ciacco, 
“External financing of small- and medium-size Business,” Survey of Current Business, U. S. Department of Commerce, 
Office of Business Economics, October 1955, p. 18. 

This probably applies to nonbank credit as much as to bank credit. Large 
firms can raise money in the capital markets, and will find it easier to borrow 
from other lenders for the same reason that banks are more willing to lend 
funds to them—they are better credit risks. The inventory levels of large firms 
are therefore independent of credit conditions not only because they borrow 
less in relation to their inventories than the other firms, but also because they 
find it easier to obtain funds when credit is tight. 


LIBERALIZING CREDIT 


When the economy is in a position in which it is desirable to encourage in- 
ventory investment, monetary policy is limited to playing a passive role. The 
availability of cheap credit can facilitate inventory investment, but it cannot 
guarantee it. Because of this, the problems discussed in reference to an effort to 
inhibit inventories are not present when considering a policy aimed at encourag- 
ing the build-up of stocks. 

There is little doubt that an easy credit policy is appropriate when a boost 
in the economy is desired. The only question is the extent to which it can 
actually influence inventory investment. Restricting credit can curb inventory 
investment, but liberalizing credit cannot initiate inventory investment. An 
easy credit policy may therefore be considered as a necessary condition to en- 
courage inventory investment, but not as a sufficient one. 


PROPER TIMING 


Criticism of monetary management during the past decade has chiefly cen- 
tered on the problem of appropriate timing. This problem is particularly acute 
when contemplating the control of inventory fluctuations. Inventory-sales 
ratios typically move countercyclically. Sales rise more rapidly during an ex- 
pansion and drop more during a contraction than inventories. It was not until 
after March 1951, when the volume of sales began to drop, that the inventory- 
sales ratio rose to a level which might have been considered excessively high. 
Action at that time would obviously have come too late. 

Coupled with this problem is the lag inherent between the time a credit re- 
striction is put into effect and the time this restriction is reflected in the inven- 
tory level. Industries which require a short processing period, and with a high 
level of purchased materials in relation to total inventories, will reflect the 
tightening of credit sooner than the other industries. However, the durable 
goods industries, which are cyclically sensitive, generally have a long process- 
ing period and a low ratio of purchased materials to total inventories. A credit 
restraint will therefore take longer in its impact on inventories in the industries 
with which monetary authorities are most concerned. 


CONCLUSIONS 


The use of monetary policy to control inventory fluctuations presents a 
number of problems. These stem chiefly from the fact that a credit restriction 
aimed at reducing inventory fluctuations cannot be aimed exclusively at the 
industries or firms whose inventories are endangering the stability of the 
economy. 

We found that: 

1. The industries which were cyclically sensitive did not borrow more or 
less in relation to their inventories than the other industries. 

2. Durable goods industries as a group had larger inventory cycles but 
borrowed less relative to inventories than the nondurables. 

3. The inventory level of large firms was not sensitive to changes in credit 
conditions. 

4. Industries with large seasonal fluctuations might be unduly penalized by 
the higher cost of credit. 

5. Determining the proper time for a change in monetary policy would be 
very difficult. 




LEADING AMERICAN STATISTICIANS OF THE 
NINETEENTH CENTURY II 


Paul J. FitzPatrick 
The Catholic University of America 


This paper explores the statistical contributions of six leading Ameri- 
can statisticians of the nineteenth century. They are George Tucker 
(1775-1861), Edward Deering Mansfield (1801-1880), Joseph Camp 
Griffith Kennedy (1813-1887), Charles Williams Seaton (1831-1885), 
Robert Percival Porter (1852-1917), and Roland Post Falkner (1866-1940). 


In a recent article* the life and work of seven outstanding American statisticians 
of the nineteenth century were presented, revealing for the first time 
their pioneering efforts to advance statistical ideas and techniques in the 
United States. These seven statisticians were Lemuel Shattuck, founder of the 
American Statistical Association, and regarded by some as “the most influential 
American statist”; Dr. Edward Jarvis, an outstanding authority in American 
vital statistics, and the third president of the American Statistical Association 
(1852-82); James D. B. De Bow, head of the short-lived Bureau of Statistics 
of the State of Louisiana (1848-52), and superintendent of the federal census 
in 1853; General Francis Amasa Walker, superintendent of both the 1870 and 
the 1880 federal censuses, and the fourth president of the American Statistical 
Association (1883-97); Richmond Mayo Smith, author of several leading 
statistical works, and the first professor to offer a statistical laboratory and 
seminar in any American university; Colonel Carroll Davidson Wright, the 
well-known chief of the Massachusetts Bureau of Statistics of Labor, the first 
Commissioner of the United States Bureau of Labor, and the fifth president of 
the American Statistical Association (1897-1907); and Charles F. Pidgin, chief 
clerk from 1873 to 1903 and director of the Massachusetts Bureau of Statistics 
of Labor from 1903 to 1907, an inventor of arithmetical and calculating ma- 
chines, and one whom Mayo Smith has called “the best practical statistician in 
America.” 

While exploring for statistical contributions of these seven statisticians, sta- 
tistical activities of six other leading American statisticians intruded upon the 
field of vision and seemed to require some treatment too. As a result the life 
and labors of these six statisticians are presented in the current article. In this 
way students of American statistical thought may become acquainted with their 
contributions to the field of statistics. These six statisticians deserving of rec- 
ognition are George Tucker, probably the first American professor to give in- 
struction in statistics, and author of the first American statistical text; Ed- 
ward D. Mansfield, the first Commissioner of Statistics for Ohio; Joseph C. G. 
Kennedy, superintendent of both the 1850 and the 1860 federal censuses; 
Charles W. Seaton, the inventor of the first tallying machine, which was used in 
the 1870 federal census, and who succeeded Francis A. Walker as superintendent 
of the 1880 census; Robert P. Porter, superintendent of the 1890 federal 
census who introduced a wider and more intensive use of the electric tabulating 
machine invented at the census by Hollerith; and Roland P. Falkner, the first 
American professor to devote full time to the teaching of statistics, who directed 
for the United States Senate Committee on Finance the most exhaustive 
investigation of prices and wages in the United States up to that time, 
known as the Aldrich Reports. 


* FitzPatrick, Paul J., “Leading American Statisticians in the Nineteenth Century,” Journal of the American 
Statistical Association, September 1957, Volume 52, pp. 301-21. 


GEORGE TUCKER (1775-1861) 


George Tucker, biographer, economist, geographer, historian, lawyer, 
novelist, statesman, statistician, and versatile polymath, was an original 
thinker in the field of statistics. He was the first American professor to perceive 
the desirability of some statistical instruction and to offer it [12]. He was not 
only the first teacher of economics at the University of Virginia, but also the 
first one to teach statistics there. He was also the first to observe the need of a 
national statistical society (the American Statistical Association, founded in 
1839, was for the first sixty years more or less a local Boston society in spite of 
its title). He was the first to advocate the inclusion of statistics along with 
economics as a section in the American Association for the Advancement of 
Science. He is the author of a famous work, Progress of the United States in 
Population and Wealth in Fifty Years (1843), well regarded as a statistical work. 

Tucker was a pioneer in this country in advocating the use of statistics to 
place economics on a sound scientific basis. This may be observed in his 
Progress of the United States in Population and Wealth in Fifty Years, published 
by Cary and Hart of Philadelphia, Frank Taylor of Washington, D. C., and 
Little and Brown of Boston, which contained 211 pages, embracing twenty-one 
chapters. It had previously been published in installments in Hunt’s Merchant’s 
Magazine, from July 1842 to December 1843 (Volumes VII, VIII, and IX). 
This work was written, as Tucker states in the preface, because: 

“The writer of the following pages being desirous of further gratifying the curi- 
osity he had always felt on the subject of the census of the United States, was induced 
to make a thorough analysis of it from 1790 to 1840. The result of his inquiries decided 
him on giving them to the public. They have conducted him to important inferences 
on the subjects of the probabilities of life, the proportion between the sexes, emigra- 
tion, the diversities between the two races which compose our population, the 
progress of Slavery, the progress of productive industry.” 


Some are inclined to regard this book as probably our first American statistics 
textbook. Walter Willcox of Cornell stated that this work was “the most important 
American book on statistics to appear in the first half of the nineteenth 
century.” He added that Tucker “displayed remarkable insight in utilizing 
scanty census materials” [13], [34]. 

Another volume by Tucker, The Theory of Money and Banks Investigated 
(1839), published by Little and Brown of Boston, makes considerable use of 
statistics. Still another work by Tucker is The Laws of Wages, Profits and Rent 
Investigated (1837), published by Cary and Hart of Philadelphia. 

Tucker was the first American to favor a national statistical society in this 
country, and he advocated in an article in December 1847 the formation of a 
“General Statistical Society for the United States,” but in this he met with no 
success. Then he advocated in September 1848 that a “Section of Statistics and 
Political Economy” be established in the American Association for the Advance- 
ment of Science at its first meeting held in Philadelphia, but again he met with 
no success [14]. 

Tucker is highly regarded as one of the leading scholars of his time. Fetter 
described him as “perhaps the most original and thoughtful of the economists 
of this period, though not popularly or widely influential in his day” [10]. 

Turner wrote: 


“George Tucker (1775-1861) of Virginia should be ranked among the strongest 
American Economists prior to the Civil War. Economists have not forgotten him; 
never knew him. A few specialists in statistics remember his work in that field, but 
the students of economic thought make no mention of his theoretical writings. He 
was too far in advance of contemporary thought to be appreciated; and his works 
were out of print long before they could have been justly appraised. 

Tucker was a scholarly man who labored for the cause of science and a statesman 
whose power as a debater in Congress was widely recognized. His searching work on 
statistics and his American History of four volumes demonstrate strong ability in 
analysis and method. His pleasing style was supported by a careful organization of 
facts.” [31] 


Similarly, Helderman wrote: 


“During the period of his professorship at the University, he published a two- 
volume life of Thomas Jefferson, a treatise on political economy, a volume on money 
and banking, a statistical analysis of the census returns to 1840, collaborated with 
English scholars on a book descriptive of America, and gave significant addresses 
before learned societies. Moreover, his retirement and advancing years did not spell 
sterility. At Philadelphia, he continued his writing, publishing a four-volume history 
of the United States, a second treatise on political economy, a penetrating discussion 
of the banking situation after the panic of 1857, and a final volume of collected 
essays.” [19] 


Tucker was born in the town of St. George’s, Bermuda, and was sent at the 
age of 12 to Williamsburg, Va., to live with an uncle, Judge St. George Tucker 
who was professor of law at the College of William and Mary [4, Vol. 6, 175; 7, 
Vol. 19, 28-30; 9, Vol. 15, 126-127]. Upon graduating with an A.B. from 
that institution in 1797, he studied law under this uncle and then practiced law 
at Richmond, in Pittsylvania Court House, now known as Chatham, and 
Lynchburg. He first served in the Virginia legislature and later was elected a 
congressman for three successive terms (1819-1825). His skill as a debater and 
a constitutional lawyer attracted the attention of many leading men of his day, 
particularly Madison and Jefferson [22, Vol. 7, 384; 28, Vol. 7, 521]. He be- 
came a professor of moral philosophy at the University of Virginia when it 
opened its doors in 1825 [1, 5, 6, 24], and was chosen the first chairman of the 
faculty. He taught for twenty years, retiring in 1845 at the age of 70 [15]. He 
moved to Philadelphia, which he had always admired, and spent there the 
remainder of his life [8, 29, Vol. 3, 588]. 


EDWARD DEERING MANSFIELD (1801-1880) 


Edward Deering Mansfield, administrator, author, editor, and statistician, 
was the first Commissioner of Statistics for the State of Ohio, serving during 
the years 1857-68. In his first annual report as Commissioner of Statistics, he 
was quick to perceive and point out the real nature of statistics at a time when 
most statisticians were content with the “political arithmetic” aspect, namely, 
the mere compilation of numerical facts pertaining to the state. Mansfield is 
probably the first American to emphasize the important point of laws or gen- 
eralizations in statistics. He declared: 


“Statistics is the real basis. Its object is to ascertain both the facts and the laws 
of social movement. Its inquiries extend to the physical laws of man as a social 
being; to the resources of the country in which he lives; to the growth of society; to 
its labor and production; to its commerce, manufactures, and arts; to its property 
and wealth .... 

“... no statistician has yet risen up to calculate and deduct the general laws of 
growth and decline. This is really the most valuable part of statistics,—that which 
points out, with unerring accuracy, the causes which advance or retard society. 
Heretofore, statisticians have been chiefly, like the geologists, and a short time since, 
the chemists, engaged in ascertaining the facts of their science, rather than the general 
and universal laws which govern it. The science of statistics, like the mechanics of 
the natural world, embraces two systems of principles: 1. The principles of man and 
society as a fixed existence, simply as beings; 2. The principles of man and society 
in movement .... It is the last which most interests the people. The principles 
which govern the movement of society can only be deduced by ascertaining all the 
elements of society, with exactness, and at successive periods of time.” [11] 


His second annual report on pages 38-40 contained a discussion as to the 
nature of “social statistics.” His fifth annual report on page 42 pointed out the 
significance of the work of Professor Henry, the first secretary of the Smithsonian 
Institution, in measuring the heights and weights of the soldiers in the 
Army of the Potomac. And in his seventh annual report on page 46 he discusses 
an inquiry from Dr. Francis Lieber, a distinguished scholar, relative to the 
desirability of collecting political statistics of the State of Ohio. 

One writer commented that these “reports upon the condition of the State, 
materially and morally, are the best representation ever given of a territory of 
equal extent, and a population of equal numbers” [2]. Two other writers re- 
ported that “he was the author of some strong and intelligent reports as Com- 
missioner of Statistics” [16]. 

One interesting event in Mansfield’s early life took place in the summer of 
1826 when he and a friend, Benjamin Drake, undertook a house-to-house study 
of the city of Cincinnati with its population of 16,200 persons, in order to 
ascertain its status and its advantages and thus induce immigration to that 
city. This investigation, a booklet of 100 pages, entitled Cincinnati in 1826, a 
scarce item now, was partly financed by the city council by an appropriation 
of $75.00, and it was so well received that it was translated into German and 
also republished in England. Mansfield enjoyed this early statistical work, re- 
porting that “taking the census and taking statistics is and must be instructive 
as amusing. Such work takes you into the very homes of the people” [25, 
17]. 

Mansfield was born in New Haven, Conn., and received his early education 
in Marietta and Cincinnati, Ohio, and later at the United States Military Academy, 
where he graduated as a second lieutenant of the army engineers, July 
1819 [4, Vol. 4, 195; 7, Vol. 12, 255-56; 22, Vol. 5, 348]. However, he declined 
the commission, and while preparing for college at Farmington, Conn., he met 
Timothy Pitkin, who later became a prominent congressman from Connecticut 
and the author of A Statistical View of the United States of America. Mansfield 
entered the College of New Jersey (now Princeton University), graduating 
with an A.B. degree with high honors in 1822. After having studied law under 
Judge Gould in Litchfield, Conn., 1823-1825, he was admitted to the bar in 
1825 and practiced law in Connecticut until May 1826 when he moved to 
Cincinnati, practicing law there during the years 1826-1836. During 1836 he 
was Professor of Constitutional Law and History at Cincinnati College where 
he came in contact with the Reverend William Holmes McGuffey, president 
of the institution, and professor of moral and intellectual philosophy, who 
later was to teach statistics at the University of Virginia, 1845-1857. Mansfield 
was active in the formation of Cincinnati College for Teachers. He served as 
editor, at times, of the Cincinnati Chronicle, 1835-1849, and of the Cincinnati 
Chronicle and Atlas, 1849-1852, as well as of the Cincinnati Gazette for certain 
years during the period 1853-1871. He also edited the Railroad Record from 
1853 to 1871. For many years he was a regular correspondent with the New 
York Times using the pen name, “A Veteran Observer.” He was, indeed, a prolific 
writer, and his writings were “recognized as the ablest and most reliable 
commentaries on current events contained in any publication”. He was well- 
known for his impartiality and fairness [28, Vol. 11, 206]. He was a cor- 
responding member of the American Geographical and Statistical Society, 
founded in New York City in 1851, of which Quetelet, the famous Belgian 
statistician, was an honorary member. Mansfield was also an associate of the 
Société Française de Statistique Universelle. Princeton College gave him an 
honorary degree of M.A. in 1851, and Marietta College honored him with a 
LL.D. in 1854. 

Mansfield was the author of The Political Grammar of the United States 
(1834) later changed to Political Manual, which was widely used; A Discourse 
on the Utility of Mathematics (1834); A Treatise on Constitutional Law (1835); 
The Legal Rights, Liabilities and Duties of Women (1845); The Life of General 
Winfield Scott (1846); A Popular and Authentic Life of Ulysses S. Grant (1868), 
and others. 


JOSEPH CAMP GRIFFITH KENNEDY (1813-1887) 


Joseph Camp Griffith Kennedy, administrator, lawyer, and statistician, who 
enjoys the distinction of being the first official to be designated as the super- 
intendent of the federal census, was deeply interested in advancing the work 
of better census-taking. In 1849 he was appointed secretary of the United 
States Census Board, created by an act of Congress, March 3, 1849, and consisting 
of the Secretary of State, who had previously been entrusted exclusively 
with the federal census, the Attorney-General, and the Postmaster-General 
[22, Vol. 4, 512-513]. Kennedy served in that capacity from May 1, 
1849 to May 31, 1850, when the Secretary of the Interior, then in charge of the 
federal census, appointed him superintending clerk, to be better known as 
superintendent of the Seventh (1850) census. He served until March 1853 
when he resigned because of a change in administration. Franklin Pierce, the 
incoming Democratic president, appointed James D. B. DeBow of Louisiana, 
March 18, 1853, under whose direction the work of the federal census was com- 
pleted. DeBow served until 1855. An act of Congress dated June 12, 1858, 
ordered that a digest of statistics of manufactures be prepared, and President 
Buchanan placed Kennedy in charge of this assignment, which was completed 
in December 1859. He remained as superintending clerk from January 1, 1860 
to May 31, 1860, when he was appointed superintendent of the Eighth (1860) 
Census, a position he held until June 1865 [20, 14, 17-18; 37, 40, 49-50]. 

In 1851 he was authorized by the federal government to visit Europe in the 
interest of statistical work, particularly that relating to the taking of the federal 
census, and to work for cheap postage. Visiting England, Belgium, France, 
Prussia, and Austria he held conferences with leading public officials, examined 
official statistics, and studied various methods of census taking. He strongly 
urged the serious need of comparability of census data. Of the several addresses 
delivered by him in Europe, one was given before the Section of Statistics of 
the British Association for the Advancement of Science. In this British address 
he explained the character of the data collected in the United States censuses 
and the usefulness of such data in the determination of public policy to promote 
the general welfare of the people. Moreover, in cooperation with Quetelet, 
Guizot, and Chevalier, Kennedy played a strong part in the organization of the 
First International Statistical Congress, held in Brussels in 1853, and he was 
a member of both the Second International Statistical Congress held in Paris 
in 1855 and the Fourth International Statistical Congress held in London in 
1860 [28, Vol. 11, 168-169]. 

Kennedy is the author of an informative 29-page pamphlet, Progress of 
Statistics, which he read before the American Geographical and Statistical 
Society at their annual meeting, December 1, 1859. It was later published in 
the society’s Journal in 1860 (pp. 92-110) under the revised title “The Origin 
and Progress of Statistics.” This paper reviewed the historical development of 
census-taking in Europe and the United States. 

Kennedy was born in Meadville, Pa., and received his education at, but did 
not graduate from, Allegheny College. He read law, and in a few years became 
the owner and the editor of two local newspapers, the Crawford (Pa.) Messenger, 
and the Venango (Pa.) Intelligencer. This venture turned out to be unsuccessful 
[4, Vol. 3, 517-518; 7, Vol. 10, 335-336; 23]. In 1849 he was appointed 
secretary of the United States Census Board. In 1851 he served as secretary 
of the United States Commission to the London World’s Fair, and in 1861 he 
was commissioner to the London International Exhibition. In 1865-66 he was 
an examiner of national banks under the Comptroller of the Currency. 

In 1866 he was honored by King Christian IX of Denmark by the presenta- 
tion of a gold medal in recognition of his work in the field of statistics. He was 
a member not only of statistical societies of England, France, Ireland, and 
other countries, but also of many other scientific societies. His alma mater, 
Allegheny College, honored him with the honorary degree of M.A. in 1852 
and again in 1867 with the degree of LL.D. After his retirement from govern- 
ment work, he practiced law, and also engaged in some real estate activities. 


CHARLES WILLIAMS SEATON (1831-1885) 


Charles Williams Seaton, inventor and statistician, was superintendent of 
the 1880 federal census and the 1875 New York State census. He was the in- 
ventor of a tallying machine first used in the 1870 federal census and again in 
the 1880 federal census. He served as a division chief in the 1870 federal census 
under General Francis A. Walker, and as chief clerk under Walker in the 1880 
census [37, 67-68]. When Walker resigned to accept the presidency of Mas- 
sachusetts Institute of Technology, Seaton succeeded him as superintendent on 
November 4, 1881 and served until March 1885. Seaton was the author of the 
Compendium of the Tenth Census in two volumes, published in 1883. Seaton 
was also the superintendent of the 1875 New York State Census. When re- 
porting on this census, the Secretary of the State of New York pointed out that 
“I was fortunate enough to secure the assistance of Mr. C. W. Seaton, a gentle- 
man who had assisted in digesting the Federal census of 1870, and who united 
in himself, in an eminent degree, the qualities which are most desirable for such 
duties” [30]. 

Seaton’s inventive ability was reflected by his tallying machine, which was 
first used late in the year 1872 for the compilation of the 1870 census—the first 
census to utilize a machine for tabulating data. This machine was extensively 
employed in the 1880 census. A picture and brief description of this machine 
may be seen in The Century Magazine [26; 20, 19, 25]. Such a machine was 
badly needed because the 1870 census, as well as the 1880 census, had become 
so extensive in its many aspects that hand tabulation, being inaccurate and 
expensive, could not handle this increasing volume of work. In later censuses, 
this machine was superseded by the electrical tabulation machine invented by 
Herman Hollerith. Seaton also invented in 1884 a matrix printing apparatus 
for census work. 

Seaton was born in Norfolk, N. Y. He received his early education in the 
academies at Champlain and Malone, N. Y. and graduated from Middle- 
bury College, Vermont, in 1857 [28, Vol. 12, 217-218]. He taught in academies 
at Monson, Mass., and Keeseville, N. Y. Early in the Civil War, Company 
F., First Vermont Sharpshooters, composed of some of his former students, 
was recruited by him, and he was commissioned its first lieutenant. Later he 
was made a captain. Resigning in the winter of 1863-64, he secured employ- 
ment as an agent in the pension department of the Sanitary Commission, and 
later he was made chief clerk of the United States Pension Office. 


ROBERT PERCIVAL PORTER (1852-1917) 


Robert Percival Porter, author, administrator, journalist, and statistician, 
was superintendent of the 1890 federal census, and he was instrumental in 
expediting the work of that census by a more intensive use of electrical tabulat- 
ing machines. 

On April 17, 1889, President Harrison appointed him superintendent of the 
1890 (Eleventh) federal census, a position he held until he resigned July 31, 
1893 to return to editorial work [37, 70, 74]. His position was taken over by 
Carroll D. Wright, the United States Commissioner of Labor, who completed 
the census work in October 1897. As superintendent of the Eleventh Census, 
Porter increased considerably its efficiency by the wider and intensive use of 
the Hollerith electrical tabulating machines [21] and also considerably in- 
creased the scope of census data. 

Porter gives an interesting account of the difficulties involved in the setting 
up of an organization to take the federal census. In his testimony before a 
select committee of the House of Representatives, relating to a proposal to 
establish a permanent bureau of the census, which he so strongly favored, he 
indicated some of the considerable difficulties which a superintendent has to 
face when setting up an organization from scratch. In part, he said: 

“When I was appointed I had nothing but one clerk and a messenger, and a desk 
with some white paper on it . . . . The difficulty comes in getting your force together, 
picking out your men. I was not able to get more than three of the old men from 
this city .... Then, knowing all the agents of the Tenth Census, I wrote asking 
them if they were prepared to take up the work again. Some were and some declined. 
. . . Some of them were dead and some in private business. I succeeded in getting 
one from Colorado. ... He had a good practice out there as a lawyer in Denver, 
where he had gone originally for his health. I could not pay him as much as he was 
making, but he was fond of statistical work and was desirous of again taking up the 
inquiry he had conducted in the Tenth Census. With these men we started up the 
organization” [21; 20, 27]. 


Some 120,000,000 copies were printed of 2,400 forms, in the preparation of 
which much care and thought had to be given if the census was to be a success. 
Porter also pointed out: 

“To guide us in getting up these blanks we had only a few scrapbooks that some 
one had had the forethought to use in saving some of the forms of blanks in the last 
census. He had taken them home, a few copies at a time, and put them into scrap- 
books. The Government had taken no care of these things in 1885, when the office 
was closed up. Some of them had been sold for waste paper, others had been burned, 
and others lost” [21; 20, 28]. 


Porter was born in Norwich, England, and received his early education there 
and at the King Edward VI School. Because of delicate health, he came to 
California in his early youth to live on his relatives’ farm, and thus continued 
his education in that state [7, Vol. 15, 100-101; 28, Vol. 12, 216-217]. His first 
job in 1872 was as a reporter of the Chicago Inter-Ocean, and he also engaged 
in various other newspaper activities. In 1880-81 he secured a position in the 
1880 (Tenth) federal census under General Francis A. Walker, the superintendent, 
and compiled reports relating to wealth, debt, taxation, and transportation. 
In 1882, he was appointed by President Arthur as a member of the 
United States Tariff Commission and played a considerable part in the fram- 
ing of the 1883 tariff law. In 1885 he became one of the editors of the Phila- 
delphia Press and in December 1887 he, together with Frank Hatton, founded 
the New York Press. 

In 1898 President McKinley appointed him special commissioner to Cuba 
and Puerto Rico. After 1900 he returned to journalism and also engaged in 
undertaking various economic studies. In 1904 he joined the staff of the London 
Times which sent him to Washington as chief correspondent. In 1909 he re- 
turned to England where he spent the remainder of his life. 
Porter was the author of many works dealing with the tariff, municipal 
ownership, Cuba, Japan, and other subjects. With respect to statistics, he 
wrote “Eleventh Census” (September 1891 issue of Quarterly Publications of the 
American Statistical Association), “Eleventh U. S. Census” (December 1894 
issue of the Journal of Royal Statistical Society), and “The Census of 1900” 
(December 1897 issue of the North American Review). 


ROLAND POST FALKNER (1866-1940) 


Roland Post Falkner, economist, editor, educator, and statistician, along 
with Davis R. Dewey of Massachusetts Institute of Technology, enjoys the 
distinction of being one of the first two American professors to have the word 
“statistics” in their professorial title, although Dewey had a higher rank, being 
an Assistant Professor of Economics and Statistics. Dewey, a well-known 
professor of statistics for more than forty years, was the first editor 1888-1906 
of the Journal of the American Statistical Association, at first called Quarterly 
Publications. Falkner probably enjoys another distinction, that of being the 
first American professor to devote full time to the teaching of statistics. Fur- 
thermore, Falkner, as statistician for the United States Senate Committee on 
Finance, directed the most exhaustive investigation of prices and wages in the 
United States up to that time. 

Falkner played a prominent part in the advancement of statistics in this 
country by his teaching, and his writings, and as statistician of the United 
States Senate Committee on Finance. As an instructor in accounting and 
statistics at the Wharton School of Finance and Economy at the University 
of Pennsylvania, he started teaching statistics in the autumn of 1888. In 1891 
he was made an associate professor of statistics, a rank he held until 1900 
when he resigned to become chief of the division of documents at the Library of 
Congress. He not only handled his regular course 9, Statistics, in the Wharton 
School, but beginning the academic year, 1894-95, he offered a number of 
graduate courses in statistics, such as Introduction to Statistics, Statistics of 
Economic Problems, Statistical Practice and Theory and History of Statistics, 
in the Department of Philosophy (graduate division) and continued to do so, 
with some slight changes in course titles, until he left Penn in 1900 [33]. 

Early in 1891 he published a 243-page translation of August Meitzen’s 
famous work, Geschichte, Theorie, und Technik der Statistik (Berlin, 1886) as two 
separate supplements (March and May 1891) in the Annals of the American 
Academy of Political and Social Science, bearing the similar title History, 
Theory and Techniques of Statistics. This translation was widely used as a text 
by a number of colleges and universities as revealed by their designations in the 
course description. 

While serving as statistician for the United States Senate Committee on 
Finance, he directed the investigation of prices and wages in the United States, 
the data being collected for him by the United States Department of Labor. 
The results of this great study, known as the Aldrich Reports, consist of three 
volumes devoted to Retail Prices and Wages, published in 1892, and four 
volumes to Wholesale Prices, Wages, and Transportation, published in 1893. 
This outstanding investigation, generally considered to be the most exhaustive 
examination of the history of prices and wages in the United States, involved the 
tremendous collection of prices as far back as 1840, and served as a basis for 
the compilation of a wholesale price index number embracing 85 commodities 
for the years 1840-91, and 223 commodities for the years 1860-91 [27; 32]. 
The investigation and the resulting construction of a wholesale price index 
for this 52-year period was, indeed, a pioneering statistical project, in spite 
of some weaknesses. This index was the forerunner of the current United 
States Bureau of Labor Statistics Index of Wholesale Prices. Moreover, this 
study along with that of Carroll D. Wright, as Commissioner of Statistics of 
Labor of Massachusetts in 1884, represented “two important improvements in 
methods of presenting statistics of wages.” Furthermore, it was an American 
demonstration of the use of averages: while Falkner experimented with weighted 
averages, he relied on the simple average of relatives which both the London 
Economist index and the Sauerbeck index were then employing. 
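
The distinction drawn above between a simple and a weighted average of price relatives can be illustrated with a minimal sketch. The commodity names, prices, and weights below are invented for illustration only; they are not taken from the Aldrich Reports, and the function names are hypothetical.

```python
def simple_average_of_relatives(base_prices, current_prices):
    """Unweighted mean of price relatives (base year = 100)."""
    relatives = [100.0 * current_prices[c] / base_prices[c] for c in base_prices]
    return sum(relatives) / len(relatives)


def weighted_average_of_relatives(base_prices, current_prices, weights):
    """Mean of price relatives, each weighted by an assumed importance."""
    total_weight = sum(weights[c] for c in base_prices)
    weighted_sum = sum(
        weights[c] * 100.0 * current_prices[c] / base_prices[c]
        for c in base_prices
    )
    return weighted_sum / total_weight


# Hypothetical data: three commodities, base-year and current-year prices.
base = {"wheat": 1.00, "iron": 20.00, "cotton": 0.10}
current = {"wheat": 1.10, "iron": 18.00, "cotton": 0.12}
weights = {"wheat": 5, "iron": 3, "cotton": 2}  # assumed importance weights

simple_index = simple_average_of_relatives(base, current)
weighted_index = weighted_average_of_relatives(base, current, weights)
```

With these invented figures the relatives are 110, 90, and 120; the simple average treats each commodity alike, while the weighted average lets the more important commodities dominate.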

Some years later the Department of Labor commissioned Falkner to bring 
this Senate Committee on Finance index up to date. The result was published 
in the United States Department of Labor Bulletin 27 in 1900, bringing the data 
down to 1899. 

Falkner served as a member of a special committee of the American Econom- 
ic Association, consisting of Richmond Mayo Smith, Chairman, Davis R. 
Dewey, Walter F. Willcox, and Carroll D. Wright, which studied ways and 
means of improving the federal census. Its report, a 516-page study, “The Federal 
Census: Critical Essays by members of the American Economic Association,” 
was published in March 1899 by the Association in its Publications (New 
Series No. 2). This report played no small part in influencing Congress to 
establish a permanent Bureau of the Census in 1902. 

Falkner was born in Bridgeport, Conn. He received his early education in 
the public schools of Philadelphia, where his parents had moved in his early 
childhood, and he graduated from the Philadelphia Central High School. He 
enrolled in the Wharton School of Finance and Economy, now known as the 
Wharton School of Finance and Commerce, at the University of Pennsylvania, 
from which he received the bachelor’s degree in 1885 at the age of nineteen. He 
then went to Germany, like so many other promising young American scholars, 
where he studied at the Universities of Berlin and Halle, receiving the Ph.D. 
degree from the latter institution in 1888 at the age of twenty-two years [35; 
36; 3]. He spent the summer session of 1888 at the University of Leipzig. 

In 1889, when the American Academy of Political and Social Science was 
founded, he was named its first secretary and served until 1896 when he was 
elected vice-president. He resigned this office in 1898 because of the pressure of 
many other duties. He served as an associate editor of the Annals from the 
time of its first issue in 1890 until October 1895, when he succeeded Edmund 
J. James, the editor, who had resigned. Under Falkner’s editorship extending to 
1900, the Annals became one of America’s outstanding scholarly journals. 

He was granted a leave of absence in two academic years, 1891-92, and 
1892-93, when he served as statistician to the sub-committee to the Committee 
on Finance of the United States Senate beginning in the autumn of 1891. In 
the fall of 1892, he was appointed secretary of the American delegation to the 
International Monetary Conference, held at Brussels, where he also acted as 
one of the secretaries of the Conference. 

In 1894 he was elected a member of the International Institute of Statistics, 
whose membership, limited to 200 persons, was drawn from all parts of the 
world. He served as vice-president of the American Economic Association, 
1896 to 1898. He was a member of many scientific societies, including the Amer- 
ican Statistical Association, American Economic Association, American Acad- 
emy of Political and Social Science, and American Association for the Advance- 
ment of Science. He attended the International Statistical Institute at Berne 
in 1895. 

In 1904 he was appointed Commissioner of Education for Puerto Rico and 
served until late in 1907 when he became statistician for the United States 
Immigration Commission, a post he held from 1908-11. In 1909 he was chair- 
man of the Commission of the United States to the Republic of Liberia. During 
1911-12 he was Assistant Director of the United States Census, and in 1913 he 
was a member of the Joint Land Commission of the United States and Panama. 
He joined the Alexander Hamilton Institute in 1915, serving as its editor, 
1915-23, and director of research, 1923-26. He joined the research staff of the 
National Industrial Conference Board in 1926 and remained with them until 
his death [18]. 

He was the author of many articles, including “The Development of the 
Census” (November 1898 issue of the Annals), and, in the Quarterly Publications 
of the American Statistical Association, “The Theory and Practice of Price 
Statistics” (June and September 1892), “Wage Statistics in Theory and Practice” 
(June 1899), and “Criminal Statistics” (September 1891). 

Falkner is included in this nineteenth century treatment of the history of 
statistics because most of his statistical contributions belong to this century. 
During the twentieth century he engaged in many other fields of endeavor. 


CONCLUSIONS 


Several features are revealed by this investigation. Of the six leading Ameri- 
can statisticians, four were college graduates, Tucker, Mansfield, Seaton, and 
Falkner, while one, Kennedy, attended college but did not graduate. Porter 
was the only one not to attend college. Falkner was the only person to have post- 
graduate training. Three had legal training, Tucker, Mansfield, and Kennedy. 
Three, Mansfield, Kennedy and Porter had previous editorial experience. Only 
two taught statistics, Tucker at the University of Virginia and Falkner at the 
University of Pennsylvania. Three were superintendents of the federal censuses. 
Seaton was conspicuous in developing a tallying machine for use in the 1870 
federal census and again in the 1880 federal census. This machine was super- 
seded by the electrical tabulation machine, invented by Herman Hollerith. In 
1884 Seaton developed a matrix printing apparatus for census work. Mansfield 
was Commissioner of Statistics for the State of Ohio from 1857 to 1868. Falkner 
was Assistant Director of the 1910 federal census during the years 1911-12. 

Tucker was outstanding among these six statisticians because he was the 
first American professor to perceive the need of some statistical instruction 
and to offer it, to observe the need of a national statistical society by urging the 
formation of a “General Statistical Society for the United States,” and, when 
this effort failed, to propose the inclusion of statistics along with economics as 
a Section in the American Association for the Advancement of Science. This 
effort also failed. He wrote the first American statistics text, Progress in the 
United States in Population and Wealth in Fifty Years (1843), which is highly 
regarded. 

Falkner was the first American professor to devote full time to the teaching 
of statistics. He was the first American professor to have the word “statistics” 
in his title when appointed instructor in accounting and statistics in the autumn 
of 1888 at the University of Pennsylvania. He published in 1891 a 243-page 
translation of Meitzen’s famous work, Geschichte, Theorie, und Technik der 
Statistik (1886) which was widely used as an American text. Falkner, as statis- 
tician for the United States Senate Committee on Finance, directed from 1891 
to 1893 the most exhaustive investigation of prices and wages in the United 
States up to that time. 

Mansfield made a unique contribution to statistical thought in his 1857 
report as Commissioner of Statistics for the State of Ohio by emphasizing the 
need of formulating generalizations based on quantitative data. He was one 
of the few American statisticians of the nineteenth century to perceive the 
real need for drawing inferences. To him statistical methods were a valuable 
tool for drawing inferences from numerical data, not merely to describe the 
phenomena. In that same year he advocated a “simple and cheap plan of a 
bureau of statistics” for the State of Ohio. 


RECTIFYING INSPECTION OF A CONTINUOUS OUTPUT* 


F. J. ANSCOMBE 
Princeton University 


The problem discussed is how to choose a plan of rectifying inspection 
for a continuous output, i.e. for a flow of produced articles not aggre- 
gated into lots. It is suggested that the choice of plan should be based 
on an analysis of the total cost of the operation, and should not depend 
on arbitrarily assigning an AOQL or other quality guarantee not di- 
rectly related to the total cost. Three types of inspection plan are 
considered, namely, no inspection at all, 100% inspection, and sampling 
inspection according to the type of plan proposed in 1943 by H. F. 
Dodge. Under certain assumptions, it is shown that an appropriate 
near-optimum plan can easily be found. Two quantities, denoted by 
k and M, need to be assessed; the first is a cost factor, the second 
measures the expected rareness of abrupt deteriorations in the quality 
of output. When these are specified, the appropriate plan is chosen 
according to very simple rules explained and summarized in §8. It is 
suggested that, strictly for the purpose of rectifying inspection (and 
not for any other purpose, such as process control), Dodge’s type of 
sampling inspection plan can hardly be improved on, unless conditions 
permit of “deferred sentencing,” the output being held in bond for a 
while after it has passed the inspection point. The principal mathematical 
difficulty has been to determine the response of a Dodge plan to an 
abrupt change in quality of output. This is studied in the Appendix. 


1. INTRODUCTION 


INDUSTRIAL inspection procedures afford a good example of the tendency of 
statisticians towards mathematical development from an existing founda- 
tion, in preference to thinking about the foundation itself. The subject has its 
origin in the work of H. F. Dodge, H. G. Romig, W. A. Shewhart, W. Bartky 
and their colleagues during the 1920’s (see [29] and the introduction to [24]). 
Apparently the foundation ideas were formulated at that time, including the 
following. 


(1) If inspection is carried out according to a precisely stated rule, the effectiveness 
of the procedure can be studied and made known. A well chosen rule strictly adhered 
to is likely to prove more satisfactory to all concerned than the exercise of free judg- 
ment on the part of the inspector. 

(2) If the results of inspection are recorded in a simple but intelligible way, much 
valuable information may be obtained. 

(3) Inspection may serve various purposes. Sometimes its primary function is to 
show what the quality of the goods is, so that appropriate action can be taken. When 
the quality is seen to be poor, a lot may be rejected, or a production process may be 
stopped for readjustment. Such inspection may be said to be informative. On other 
occasions the primary purpose of the inspection is to ensure that quality is satisfac- 
tory. If quality is not satisfactory to start with, a large part or the whole of the goods 
may be inspected, all defective articles found being removed, corrected, or replaced 
by good ones. This latter sort of inspection may be said to be corrective or rectifying. 

(4) According to the purpose, ways can be found for expressing the protection or 
guarantee given by an inspection plan—consumer’s risk, lot tolerance, average out- 
going quality limit (AOQL), etc. 


* Paper prepared in connection with research supported by the Office of Naval Research. 
Since Dodge and Romig published their studies of single and double sampling 
plans for inspection of lots [24], there has been much elaboration of detail. In 
the present paper, rectifying inspection of a continuous output (i.e. of a flow 
of produced articles not composed into lots) is considered. The original study 
was by Dodge [23], published in 1943. Various modifications of Dodge’s original 
plan have since been proposed, some aimed at making the inspection more 
flexible or economical, some aimed at broadening its scope to include process 
control or process acceptance. An excellent review of the literature up to 1955 
has been given by Bowker [21]. This, more than anything else, has stimulated 
the present paper. For a remarkable feature of the literature reviewed by Bow- 
ker is that (with one exception¹) Dodge’s formulation of the problem has not 
been questioned, namely, that the purpose of the inspection is to guarantee a 
stated AOQL with the least inspection effort. Or if the formulation has been 
questioned (as indeed it seems to be in Bowker’s last paragraph), nothing has 
been done about it. 


2. THE PURPOSE OF INSPECTION 


When one fairly considers the matter, it is not clear what bearing the AOQL 
has on rectifying inspection. The AOQL is a statistician’s guarantee, quoted be- 
cause it can be calculated easily, not a user’s requirement. No user of inspection, 
not corrupted by contact with statisticians, would ever think of setting himself 
an AOQL as a target. 

The reason why a manufacturer does any inspection is that he thinks that 
in the end it will save him money (or increase his profits, or otherwise benefit 
his enterprise). If poor quality did not matter, he would not go to the trouble 
and expense of guarding against it. The amount and kind of inspection that is 
worth while depends on the seriousness of passing material of inferior quality. 
What the manufacturer would like to do is to minimize the sum of two costs, 
(i) the cost of carrying out the inspection that he does, and (ii) the ultimate cost 
to him of any inferior material that slips through the inspection. The first of 
these costs is usually fairly easy to assess, at least roughly; the second may be 
rather more difficult, especially if good will is involved. (I have discussed this 
matter in some detail elsewhere [1].) If the data of a problem are inexact, a 
rather rough solution will be good enough. What is important is that we realize 
what the problem really is, and solve that problem as well as we can, instead 
of inventing a substitute problem that can be solved exactly but is irrelevant. 
Whenever a manufacturer decides that some inspection shall or shall not be 
carried out, he has in fact balanced the cost of the inspection against its effec- 
tiveness—against the ultimate consequences of not inspecting, or inspecting by 
a different amount. If inspection procedures are to be chosen rationally, the 
balancing of costs must be made explicit, even if numerical values are only 
quite rough. Various studies of sampling inspection from the point of view of 
minimizing expected total cost (or some dilution of this principle) are listed 


1 The exception is the study by Girshick and Rubin [11], which is concerned with process control as well as 
with rectification of output, and does break fresh ground. Further work in the same direction has since been done 
by Gregory [12]. 
below in Part (a) of the bibliography.² Here we attempt to apply the principle 
to rectifying inspection of a continuous output. The treatment closely re- 
sembles the previous studies by Weibull [18] and me [1] of rectifying inspec- 
tion of lots. 

It has often been claimed that economic analysis of inspection is futile, 
because the choice of a good inspection plan depends on “unknown” economic 
parameters. (In this connection the recent paper by Horsnell [14], together with 
the published discussion, is interesting.) Since some intelligent guessing is 
needed in choosing a good plan, difficulties can arise when an inspection plan 
has to be agreed on by several parties. There may be occasions when the less 
said about economic analysis the better, though it may receive plenty of private 
thought. Bad (i.e. ordinary) accounting obscures matters, by emphasizing the 
cost of carrying out the inspection, while hiding the consequences of poor 
quality of output. While admitting all this, I see no reason to retract the con- 
tention that if inspection procedures are to be chosen rationally the balancing 
of costs must be made explicit. In the present context of rectifying continuous 
inspection, under certain simplifying assumptions, it appears that remarkably 
little economic or other information is of major importance for selecting a good 
inspection plan, but that small amount is vital. 


3. FORMULATION OF THE INSPECTION PROBLEM 


In order to make a start, let us consider the simplest possible assumptions 
that can be made regarding costs and the setting of the problem. We suppose 
that articles pass an inspection point one at a time, more or less in the order 
of their production. If inspected, the articles are classified as either “correct” 
or “defective.” We shall suppose that inspection, when performed, is always 
performed correctly, so that articles are never misclassified; and also that de- 
fectives cannot be picked out at a glance, but only by systematically perform- 
ing the operation of inspection on all articles presented. We shall suppose 
that the cost of inspecting the articles is proportional to the number inspected. 
(This assumption is likely to be roughly satisfied in practice if the amount of 
inspection performed does not vary greatly and unpredictably from day to day; 
otherwise it may need some modification.) We shall suppose that when a defec- 
tive is found during the inspection, it is immediately rectified or replaced by a 
correct article, the cost of doing this being the same each time; and that a defec- 
tive that passes the inspection point in the uninspected part of the output 
causes the manufacturer to suffer an ultimate loss, the same for every defective 
passed. (The assumption can be weakened, without affecting our conclusions, 
by saying that the losses resulting from passed defectives are independent of each 
other and have a common probability distribution. It can happen in practice 
that losses are not independent, being higher when defectives are frequent than 
when rare, but the present assumption should often be a reasonably good ap- 
proximation.) 

It turns out that, under the above assumptions, the relative merits of alterna- 
tive inspection procedures depend on the ratio of two costs, (i) the cost of in- 


2 If the list of 18 titles seems long, it is but a minute proportion of all that has been published on sampling 
inspection since 1950. 


specting an article, (ii) the difference between the ultimate loss from passing 
a defective and the cost of replacing it at the time of inspection by a correct 
article. Denoting this ratio by k, we introduce the following notation: 

Cost of inspecting an article = k cost-units. 
Cost of replacing³ or rectifying a defective article found during the inspection = a cost-units. 
Ultimate expected loss due to passing a defective = 1 + a cost-units. 


Since the value of a is irrelevant to the choice of inspection plan, it will be con- 
venient to suppose a=0. If in fact a is not zero, the expressions found below 
for the average total cost per article will need to be slightly modified, as will be 
explained. 

The relative merits of alternative inspection plans depend also on the way 
in which defectives occur in the output as it reaches the inspection point. We 
shall suppose that successive articles have independent chances p of being 
defective, where p either stays constant or varies slowly or makes an occa- 
sional abrupt change, and is in any case independent of the inspection. In 
choosing an inspection procedure we shall wish to know the relative frequency 
with which different values of p will be encountered, and also how often changes 
in p may be expected. Estimates of the behavior of p (which may or may not 
prove to be correct when the inspection plan is put into effect) will be based on 
past experience or on technical knowledge of the manufacturing process. All 
industrial enterprise is to some extent speculative, and it is not surprising if 
good guessing pays when an inspection plan is being installed, just as it does 
when other sorts of decisions are being made. Formally, estimates of the be- 
havior of p will be expressed in terms of subjective probabilities. 


4. CHOICE OF INSPECTION PLAN 


Whether inspection is profitable or not depends, for any particular value of 
p, on whether p is greater or less than k. For 1/p articles are inspected on the 
average, at a cost of k/p cost units, for each defective that is found, and re- 
placing a defective costs 1 cost-unit less than passing it. 

If it were known for sure that p would never exceed k, the most profitable 
procedure would be not to inspect at all. If it were known for sure that p would 
never be less than k, the most profitable procedure would be to inspect the 
whole output. (In practice, both these statements may need modification, be- 
cause our assumptions may fail. Thus, if it is known that there is no inspection 
at all, the quality of the output may deteriorate, while if all the output is in- 
spected the inspection itself may not be well done. Let us nevertheless see 
where our assumptions lead. It should also be noted that we are ignoring the 
value of the information provided by an inspection plan, when proper records 
are kept. If such information is important, some small amount of inspection 


3 When a defective article cannot be rectified, we suppose that it is replaced. The idea is that the inspector 
has a stock-pile of correct articles at hand, and draws replacements from that. Then a is the cost of removing 
(scrapping) the defective article plus the cost of the replacing article. The assumption about replacement is made 
because it simplifies the discussion slightly. No doubt it would be more usual in practice for the inspector not to 
make any direct replacement but merely remove the defective articles that he found from the line. In that case 
the output would be reduced. A correct theoretical treatment would differ a little from that given here, but the 
numerical results would be almost the same if a were taken to be the cost of removing a defective article plus the 
cost of production of that article. 
will be preferred to none, even if the quality of output is expected to be always 
good.) 

If we cannot be sure that p will be always less than k, or always greater than 
k, we shall desire an inspection plan such that if p happens to be constantly 
less than k as little inspection as possible will be done, while if p happens to be 
constantly greater than k as much inspection as possible will be done. (If p is 
close to k, it will not matter how much inspection is done.) 

Bowker, in indicating the desirable statistical properties of continuous sam- 
pling plans, lists this: “As quality deteriorates, the plan should require only 
enough inspection to make the AOQ approach the AOQL.” If indeed maintain- 
ing a given AOQL were the purpose of the inspection, this would be correct. 
But if the view adopted here is accepted, the most profitable procedure when 
the proportion of defectives is high is to inspect the whole output and make it 
perfect. 

Let us consider Dodge’s type of inspection procedure, which has two ad- 
justable parameters. I have changed Dodge’s notation f for the sampling frac- 
tion to 1/n, as that makes the ensuing formulas neater. 

THE DODGE PLAN. At the outset, inspect 100% of the units consecutively as 
produced and continue such inspection until i units in succession are found clear of 
defects. When this happens, discontinue 100% inspection and inspect only every 
nth unit. If a sample unit is found defective, revert immediately to 100% inspection 
of succeeding units and continue until again i units in succession are found clear of 
defects. Correct or replace with good units all defective units found. 


Dodge stipulates that, during the periods of sampling inspection, sample 
units are selected one at a time from the flow of the product, in such a manner 
as to assure an unbiased sample. No doubt what happens in practice is that 
the inspector takes approximately every nth article, varying the interval a 
little to guard against a nonrandom pattern of defectives. For the purpose of 
the calculations below it is assumed that defectives occur “at random,” as al- 
ready explained, and that exactly every nth article is inspected, during the 
sampling periods, i.e. the nth, the 2nth, the 3nth, . . . , articles after the article 
that terminated a period of 100% inspection. 
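
The rule just stated can be tried out on an artificial stream of articles. The following is only a minimal sketch under the assumptions of the calculations (independent chances p of a defective, exactly every nth article sampled, perfect inspection); the function name `simulate_dodge_plan` and its arguments are illustrative and are not part of Dodge's or the present formulation.

```python
import random


def simulate_dodge_plan(p, n, i, num_articles, seed=0):
    """Simulate a Dodge plan on a stream with constant defect chance p.

    Returns (fraction of articles inspected, passed defectives per article).
    """
    rng = random.Random(seed)
    inspected = 0
    passed_defectives = 0
    full_inspection = True  # the plan starts with 100% inspection
    clear_run = 0           # consecutive clear articles under 100% inspection
    since_sample = 0        # articles seen since sampling began or last sample

    for _ in range(num_articles):
        defective = rng.random() < p
        if full_inspection:
            inspected += 1
            if defective:
                clear_run = 0           # defective is rectified; restart the count
            else:
                clear_run += 1
                if clear_run == i:      # i clear in succession: switch to sampling
                    full_inspection = False
                    since_sample = 0
        else:
            since_sample += 1
            if since_sample == n:       # exactly every nth article is inspected
                inspected += 1
                since_sample = 0
                if defective:           # sample defective: revert to 100% inspection
                    full_inspection = True
                    clear_run = 0
            elif defective:
                passed_defectives += 1  # uninspected defective slips through

    return inspected / num_articles, passed_defectives / num_articles
```

Over a long run with p constant, the first returned value should settle near the average fraction inspected, and the second near the average loss per article from passed defectives, both of which are considered in §5.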

Another desirable property of continuous sampling plans suggested by 
Bowker is that the plan should provide for terminating inspection and shutting 
down the line when the quality deteriorates sufficiently. The absence from the 
Dodge plan of any specific criterion for shutting down production is pointed 
to as a drawback. The idea is that the inspection should serve not only as a 
control on output quality but also as a process control. Sometimes, no doubt, 
this double purpose can be served very well. A less extreme position has been 
taken by Murphy [27], who considers a simple warning rule designed to stim- 
ulate remedial action, without necessarily interrupting the inspection routine. 

Such process control is not considered in this paper, because it is not in all 
cases appropriate. Sometimes (perhaps not often, but certainly sometimes!) 
the manufacturer is unable to control the proportion of defectives in his output 
by adjusting the manufacturing process, because he just does not know what 
causes the defectives. Or it may be that the defectives are known to be due to 
some carelessness or inaccuracy in workmanship, but this is not remedied by 


stopping the process and making an “adjustment” to the personnel. Sometimes 
it is more profitable for the manufacturer to produce with a rather high propor- 
tion of defectives, which must then be removed by inspection, than not to 
produce at all. Some simple device for recording the inspection observations 
(with discrimination between sampling and 100% inspection) will enable the 
management to take stock of the process from time to time, and to see whether 
improvements or overhaul are urgently needed. 


5. COST OF A DODGE PLAN WHEN p IS CONSTANT 

Consider first the average cost of applying a Dodge plan to a long stretch of 
output, if p is constant. Dodge showed that the average fraction F of the out- 
put inspected is given by 

$$F = \frac{1}{1 + (n - 1)q^{i}}, \tag{1}$$

where q = 1 − p. The average cost of inspection per article produced is Fk, 
while the average loss from passing defectives, per article produced, is (1 − F)p. 
The average total cost C of applying the inspection plan, per article produced, 
is therefore 

$$C = p + (k - p)F. \tag{2}$$
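
As a small numerical aid, formulas (1) and (2) can be written as two functions. This is a sketch only, taking Dodge's formula in the form F = 1/(1 + (n − 1)q^i) with q = 1 − p and the cost a of replacement equal to zero; the function names are illustrative.

```python
def fraction_inspected(p, n, i):
    """Average fraction F of the output inspected, formula (1)."""
    q = 1.0 - p
    return 1.0 / (1.0 + (n - 1) * q ** i)


def average_total_cost(p, k, n, i):
    """Average total cost C per article produced, formula (2), with a = 0."""
    F = fraction_inspected(p, n, i)
    # Equivalent to F*k (inspection) + (1 - F)*p (passed defectives).
    return p + (k - p) * F
```

When p = k the second function returns k whatever the plan, which reflects the remark in §4 that if p is close to k it does not matter how much inspection is done.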


We should like F to be close to 0 if p<k, and close to 1 if p>k; and these two 
requirements are in conflict. If we expect that p will nearly always be less than 
k, it will be more important to satisfy the first requirement than the second; 
and the reverse if we expect p will nearly always exceed k. If values of p above 
and below k, in something like equal proportion, need to be reckoned with, we 
shall want F to be not far from ½ when p=k; and we might make it a provisional 
working rule,⁴ in choosing a plan, that F = ½ as exactly as possible when p=k, 
i.e. 

$$(n - 1)(1 - k)^{i} = 1, \tag{3}$$

or roughly (if n and 1/k are not too small) 

$$ki = \log_e n. \tag{4}$$

If this rule is adopted, F is approximately the following “logistic” function of p, 

$$F = \frac{1}{1 + e^{-i(p - k)}}. \tag{5}$$


Clearly, the rapidity of the transition from values of F near 0 to values near 1, 
as p goes from below to above k, increases with i (and so with n). 
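
The working rule can be put into a few lines of code. The sketch below assumes that (3) reads (n − 1)(1 − k)^i = 1 and that i is taken as the nearest integer to its solution; the rounding convention and the function names are assumptions made for illustration.

```python
import math


def clearance_number(n, k):
    """Integer i nearest to the solution of (n - 1)(1 - k)^i = 1, rule (3)."""
    return round(math.log(n - 1) / -math.log(1.0 - k))


def rough_clearance_number(n, k):
    """Approximation (4): k i = log_e n (not rounded)."""
    return math.log(n) / k


def fraction_inspected(p, n, i):
    """Dodge's average fraction inspected, formula (1)."""
    return 1.0 / (1.0 + (n - 1) * (1.0 - p) ** i)


def logistic_fraction(p, k, i):
    """Logistic approximation (5) to the fraction inspected."""
    return 1.0 / (1.0 + math.exp(-i * (p - k)))
```

For k = 0.05 and n = 10, `clearance_number` gives i = 43, for which F is very nearly ½ at p = k; the rougher rule (4) gives a somewhat larger i, about 46.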

Table 708 refers to the case k=0.05. Five plans are considered, with n=5, 
10, 20, 50, 100, and i chosen each time so that F = ½ nearly when p=k; and 
also three further plans are shown having F = ⅓ nearly when p=k. Effectively, 
values of 1000 C are tabulated against p. More exactly, what is tabulated is the 
excess of 1000 C over the least value it could have if we knew for sure whether 


4 Weibull [18] proposed a similar rule for rectifying inspection of lots. 
TABLE 708 

AVERAGE TOTAL COST OF DODGE PLANS WHEN p IS 
CONSTANT AND k=0.05 

The minimum average cost, and the excess average costs (= 1000 ΔC) for eight plans, per 1000 units 
of output, are shown. The first five plans have F = ½ nearly when p = k; the remaining three 
plans have F = ⅓ nearly when p = k. 

[Table entries not reproduced.] 
p was less than k or greater. That least value is also shown; it is 1000 p if p<k 
and 1000 k if p>k. Denoting the excess by 1000 ΔC, we have 

$$\Delta C = \begin{cases} (k - p)F & \text{if } p < k, \\ (p - k)(1 - F) & \text{if } p > k. \end{cases} \tag{6}$$


The excess may be regarded as the cost of our not knowing the value of p; deci- 
sion theorists term it the regret. It is the only component of the cost that can 
be varied by choice of the inspection plan; the rest of the cost is inherent in de- 
fective articles having been produced. We desire to minimize the expected 
value of the excess cost, for the distribution of values of p that we expect to 
encounter—or rather, minimize the sum of this expected cost and of the ex- 
pected cost due to changes in p, as discussed below. 
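
The excess cost (the regret) is easily computed for any plan. A minimal sketch follows, assuming formula (1) for F and the definition of the excess as (k − p)F for p < k and (p − k)(1 − F) for p > k; the function names are illustrative.

```python
def fraction_inspected(p, n, i):
    """Dodge's average fraction inspected, formula (1)."""
    return 1.0 / (1.0 + (n - 1) * (1.0 - p) ** i)


def excess_cost(p, k, n, i):
    """Excess of the average total cost C over min(p, k), per article."""
    F = fraction_inspected(p, n, i)
    if p < k:
        return (k - p) * F        # inspecting when it does not pay
    return (p - k) * (1.0 - F)    # failing to inspect when it does pay
```

For example, with k = 0.05, n = 10, i = 43 and p = 0, the fraction inspected is 0.1 and the excess cost is 0.005 per article, i.e. 5 per thousand units of output.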

It is obvious that the larger n is [i being chosen always in accordance with 
(3)] the lower is the expected excess cost, when p is constant. We shall see that 
the possibility of abrupt changes in p limits the size of n that is advisable. Some 
idea of the effect of changing the value of F from ½ when p=k can be gathered 
by examining the plans having F = ⅓ at p=k. For p<k, ΔC is reduced a little; 
for p>k, ΔC is considerably increased. Unless values of p below k are thought to 
be far more probable than values above k, there is little to be said for making F 
different from ½ at p=k. 

The calculations for the plans with n = 10 are also shown graphically in Figure 
709. 

FIG. 709. Average total cost of inspection plans per thousand units of output when p 
is constant and k = 0.05. (Horizontal axis: proportion defective p.) 
(I) No inspection. 
(II) Dodge plan with n=10, i=29; F = ⅓ when p=k. 
(III) Dodge plan with n=10, i=43; F = ½ when p=k. 
(IV) 100% inspection. 

The entries in Table 708 can easily be adapted (approximately) for any 
other small value of k, as follows. If k is multiplied by a factor, divide the values 
of i shown by the same factor, leave n unchanged, and multiply the values of 
p and all the average cost figures by the factor. 

The above calculations are based on the assumption that the cost a of re- 
placing a defective article by a correct one is zero. If a is not zero, the minimum 
costs shown in Table 708 must be increased by adding 1000 ap to them; the ex- 
cess costs are unchanged. In the figure, the curves will be tilted upwards to 
the right. 

How may the above information about excess costs be summarized? For any 
given plan, the expected value of ΔC depends on the distribution of values of 
p, but [if (3) is satisfied] the dependence is not very marked. Widely different 
distributions for p will give nearly the same value for E(ΔC). This is because the 
graph of ΔC against p is fairly flat, especially for the smaller values of n. Values 
for E(ΔC) calculated on the assumption of a uniform distribution for p in (0, k) 
will indicate its approximate size for other distributions quite different from 
the uniform. By numerical integration of the entries in Table 708 for the uni- 
form distribution, the following has been obtained as an empirical formula: 

$$E(\Delta C) = 0.3k/\sqrt{n}. \tag{7}$$

Of course, E(ΔC) is not equal to this for all possible distributions for p, but if the 
distribution is well spread out, with nearly all the probability concentrated 
on values of p below 2k, the answer cannot be much different from the above. 
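
The empirical formula can be checked roughly by direct numerical integration. The sketch below averages the excess cost over p uniform on (0, k) by the midpoint rule; it uses formula (1) for F rather than the tabulated entries, so it is an independent check and not a reproduction of the original calculation. The function names and the number of steps are illustrative.

```python
import math


def excess_cost(p, k, n, i):
    """Excess of average total cost over min(p, k), per article."""
    F = 1.0 / (1.0 + (n - 1) * (1.0 - p) ** i)
    return (k - p) * F if p < k else (p - k) * (1.0 - F)


def expected_excess_uniform(k, n, i, steps=10000):
    """Midpoint-rule mean of the excess cost for p uniform on (0, k)."""
    h = k / steps
    total = 0.0
    for j in range(steps):
        p = (j + 0.5) * h          # midpoint of the jth subinterval
        total += excess_cost(p, k, n, i)
    return total / steps


def empirical_formula(k, n):
    """The empirical value 0.3 k / sqrt(n)."""
    return 0.3 * k / math.sqrt(n)
```

For k = 0.05, n = 10, i = 43 the two functions give values of the same size, a little under 0.005 per article.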
6. COST OF A DODGE PLAN WHEN p VARIES 


Let us consider now what happens if, after maintaining the constant value 
p₁ for a considerable time, p changes abruptly and maintains the value p₂. The 
long-run rate of total cost per article produced, C, will of course change, from 
C₁ to C₂, say, but this is not the only effect of the change in p. Suppose p₁ is zero 
or very small, while p₂ is large (e.g. equal to 2k). Then most probably, at the 
moment when p changes, sampling inspection will be in operation. But this 
is an unusual state for the inspection to be in if p = p₂, and a costly state, because 
numerous defectives will pass through. As soon as a defective has been found 
in the inspection, after the change in p occurs, the inspection rule will operate 
exactly as if this was the start of the output. It is the cost relating to the 
stretch of output from the point of change in p up to the first inspected defec- 
tive that is untypical of costs when p = p₂ constantly, and causes a sort of bias 
in the total cost of the inspection which we call the transition cost. 

Transition costs are studied in detail in the appendix. General formulas are 
derived, and some values are tabulated. If we knew the frequencies (per 
thousand articles produced, say) with which changes in p of all possible kinds 
would occur, we could use the results of the appendix to evaluate accurately 
the average transition costs (on the assumption that after every change p re- 
mained constant long enough for at least one defective to be observed, before 
the next change). Actually, we have not as much information as that. A com- 
plete probabilistic description of the process will most likely seem out of the 
question, and so a cruder calculation must suffice. 

Summing up very roughly, we may say that the largest transition costs arise 
when p₁ is small and p₂ is large, their magnitude is roughly proportional to n, 
and a typical value for such a cost may be taken to be ½n cost-units. Now 
suppose we are able to estimate that such abrupt worsenings of quality may be 
expected to occur, on the average, about once in M articles produced. Then 
roughly the average transition cost, per article produced, is n/2M cost units. 
But we have just seen that the average excess cost of the inspection, when p 
is steady, is in the neighborhood of 0.3k/√n cost-units per article produced. So 
we might reasonably choose n to minimize the sum 

$$\frac{n}{2M} + \frac{0.3k}{\sqrt{n}},$$

i.e. take n = (0.3kM)^{2/3}. This formula should at least indicate the right order 
of magnitude for n. When n is chosen in this way, the average cost of transitions 
is (roughly) equal to one half the average excess cost defined in §5. 
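
The minimization may be spelled out as follows, treating n as a continuous variable. The symbol S for the sum is introduced here only for convenience.

```latex
S(n) = \frac{n}{2M} + \frac{0.3k}{\sqrt{n}}, \qquad
\frac{dS}{dn} = \frac{1}{2M} - \frac{0.15k}{n^{3/2}} = 0
\;\Longrightarrow\; n^{3/2} = 0.3kM, \quad n = (0.3kM)^{2/3}.

% The second derivative 0.225 k n^{-5/2} is positive, so this is a minimum.
% Substituting 2M = 2 n^{3/2} / (0.3k) into the transition term:
\frac{n}{2M} = \frac{0.15k}{\sqrt{n}} = \frac{1}{2}\cdot\frac{0.3k}{\sqrt{n}}.
```

The last line shows why, at the chosen n, the average transition cost is one half of the average excess cost.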
To illustrate the use of the formula, suppose that production is at the rate 
of 4000 articles per week, and it is thought that sudden deteriorations of qual- 
ity are likely to occur on the average once every two or three weeks. Then 


M =10,000, about. Suppose we have estimated that k=0.05. The formula indi- 
cates we should take 


n = (150)^{2/3} = 28, roughly, 

or say n=25, a 4% sampling rate. Condition (4) now gives 

i = 20 log_e 25 = 64, roughly, 

or say i=60. If we had expected abrupt deteriorations only once every six 
months, conditions being otherwise the same, we should have had M = 100,000, 
n = (1500)^{2/3} = 130, roughly, or say (to discount excessive optimism) n = 100; 
and then i = 20 log_e 100 = 92, or say i = 90. 


7. OTHER PLANS 


A feature of the Dodge plan, shared by the other continuous sampling 
plans reviewed by Bowker, is that once an article has passed the inspection 
point it is never recalled. If during a period of sampling inspection quality 
seems to deteriorate, 100% inspection is done on the following output. 

With this condition, it seems unlikely that a well-chosen Dodge plan can be 
improved on substantially by a more complicated procedure, if abrupt changes 
in quality are liable to occur. For transition costs will be large unless there is 
some intensive inspection after each defective observed, and so a good plan 
cannot be very different from a Dodge plan.⁵ 

If the condition about no recall can be relaxed, the “deferred sentencing” 
principle may be used. This was developed during World War II for the destruc- 
tive proof of ammunition (see [19]). In the present context, it would operate 
as follows. Primary sampling inspection is performed on the whole output, at a 
low sampling fraction, and after passing the inspection point all output is 
stored, in order, until it is released by the inspector. Output is released provided 
defectives are found only rarely in the primary inspection. Some criterion is 
used to pick out what appear to be clusters of defectives, indicating patches of 
poor quality. The part of the output represented by a “bad patch” in the pri- 
mary inspection record, together with some on either side, will be recalled from 
store and inspected more heavily. If the proportion of defectives seems to be 
greater than k, the product will be inspected 100%. Thus whether a particular 
short section of the output is released from store without further inspection, 
or brought back for reexamination, depends on the number of defectives found 
in the primary inspection of both earlier and later output, and judgment on it 
is therefore “deferred.” The recall feature keeps transition costs low, and per- 
mits a low sampling rate and therefore low excess costs as defined in §5. These 
advantages need to be weighed against the cost of storage, handling, and delay 
in release. 

The criterion originally proposed in [19] for picking out bad patches in the 
primary inspection record was very simple, namely, that some stated number 
of defectives (e.g. 5) had been found among not more than some stated num- 
ber of consecutive articles tested. A more flexible (and theoretically simpler) 
criterion has been explored by Page [28] and applied to several types of inspec- 
tion problem. 


5 The multilevel plans suggested by Lieberman and Solomon [26] seem unpromising from this point of view. 
These authors’ discussion of “local stability” must be interpreted with caution, as they seem to assume that p is 
constant and do not investigate the response of their plans to a change in p. Even the tightened multilevel plans 
suggested by Derman, Littauer and Solomon [22], when the number of levels is unlimited, are open to the objection 
that a prolonged period of almost perfect production will almost certainly lead to an almost zero sampling rate, 
and then there is effectively no protection at all against deterioration in quality of output. 


712 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1958 


8. CONCLUSIONS 


We have seen how the total cost of inspecting a continuous output by a 
Dodge plan may be investigated; this total cost includes the ultimate loss re- 
sulting from bad material that passes undetected during periods of sampling 
inspection. The total cost will depend (and hence the selection of a good 
sampling plan will depend) on a number of factors, notably (i) the relative 
expensiveness of inspecting an article and (if it should be defective) replacing 
it, versus allowing a defective article to pass undetected, and (ii) the char- 
acteristics of the output that will be submitted for inspection, namely, the rela- 
tive frequency with which various qualities (proportions of defectives) will be 
encountered, and also the frequency with which changes in quality will occur. 
When the plan is first installed, these characteristics of the output can only be 
guessed at, but in some instances, at least, experience will permit of good guess- 
ing. If suitable records are kept, the characteristics of the actual output can 
be noted, and the inspection plan can be modified later if there seems to be 
good reason. 

The following rough and ready rules are suggested for installing a Dodge 
plan. First, estimate k (which is the reciprocal of the number of articles that 
can be inspected for the same cost as the difference between the cost of letting 
a defective article go by and the cost of replacing it by a good one). If confidence 
is felt that the proportion of defectives in the output will practically never be 
appreciably higher than k, do no inspection at all. If confidence is felt that the 
proportion of defectives in the output will practically never be appreciably 
less than k, inspect the whole output. (These rules carry provisos, however, as 
explained above in §4.) 

If quality better than k, and also quality worse than k, may reasonably be 
expected, estimate M (which is such that an abrupt change from good quality 
to bad quality may be expected, on the average, about once in M articles 
produced). Then choose the Dodge plan with 


n/n =0.3kM, i = (100/k) log, n. 


(Calculate n first, rounding down rather than up. Then calculate i, rounding 
down rather than up.) 

These rough and ready rules for choosing a Dodge plan can be improved on 
a little, if more detailed information concerning the output characteristics is 
available. A plan chosen by the above rules will at least be much better than a 
badly chosen Dodge plan might be. 

I surmise that a well chosen Dodge plan can be bettered only a little by a 
more complicated type of plan, unless conditions permit plans of the “de- 
ferred sentencing” type to be considered. 

The principal assumptions on which the above findings are based (explained 
in detail in §§3 and 4) are (i) linear cost functions and (ii) perfect inspecting. 
These assumptions should often be near enough verified for the conclusions to 
be useful. But in cases where different assumptions are appropriate, an in- 
vestigation along similar lines could be attempted. The object of this paper is 
to illustrate an approach. 
APPENDIX 

Transition costs with a Dodge plan. Let the sequence of articles inspected be 
considered as broken up into segments, where the first segment comprises the 
inspected articles up to and including the first defective, and the mth segment 
runs from the article next following the (m−1)th defective up to and including 
the mth defective. We call these inspection segments. We also speak of produc- 
tion segments, the mth of which consists of all units of output from the one 
next following the (m−1)th inspected defective through the mth inspected 
defective. 

If the proportion defective p of the output is constant and positive, successive 
inspection segments are independent, and the chance that any one is of length 
s is $q^{s-1}p$ (s = 1, 2, ...), where q = 1 − p. Let x denote the length of a production segment, and y 
the total cost for the segment. More exactly, y will be taken to be the expected 
total cost for the segment, averaging over the chance distribution for the num- 
ber of defectives among the uninspected articles. If the length s of the inspec- 
tion segment does not exceed i, we have for the corresponding production seg- 
ment 

$$x = s, \qquad y = ks,$$

while if s > i 

$$x = s + (n-1)(s-i), \qquad y = ks + (n-1)(s-i)p.$$

We find, on averaging over the distribution for s, 

$$\mathcal{E}(x) = \frac{1 + (n-1)q^{i}}{p}, \qquad \mathcal{E}(y) = \frac{k + (n-1)pq^{i}}{p}.$$

The average long-run cost per unit of output $C = \mathcal{E}(y)/\mathcal{E}(x)$, and so we find 

$$C = p + (k-p)F \quad \text{where} \quad F = \frac{1}{1 + (n-1)q^{i}}.$$
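
As a numerical check on the average cost, the following Python sketch evaluates C = p + (k − p)F in closed form and compares it with the ratio of expected segment cost to expected segment length, summed directly over the geometric distribution of s. The function names and the truncation point `s_max` are illustrative assumptions; the segment lengths and costs used are those implied by z = y − Cx in the text.

```python
def fraction_F(p, n, i):
    """F = 1 / (1 + (n - 1) q^i), with q = 1 - p."""
    q = 1.0 - p
    return 1.0 / (1.0 + (n - 1) * q ** i)


def average_cost(p, k, n, i):
    """Closed form C = p + (k - p) F."""
    return p + (k - p) * fraction_F(p, n, i)


def average_cost_by_segments(p, k, n, i, s_max=20000):
    """C as E(y) / E(x), summing over inspection-segment lengths s.

    For s <= i:  x = s,                     y = k s.
    For s >  i:  x = s + (n - 1)(s - i),    y = k s + (n - 1)(s - i) p.
    The series is truncated at s_max (an assumption; the neglected tail
    is negligible when q**s_max is tiny)."""
    q = 1.0 - p
    mean_x = 0.0
    mean_y = 0.0
    for s in range(1, s_max + 1):
        prob = q ** (s - 1) * p
        uninspected = (n - 1) * max(s - i, 0)
        mean_x += prob * (s + uninspected)
        mean_y += prob * (k * s + uninspected * p)
    return mean_y / mean_x


if __name__ == "__main__":
    for p in (0.01, 0.03, 0.05, 0.10):
        print(p, average_cost(p, 0.05, 20, 57),
              average_cost_by_segments(p, 0.05, 20, 57))
```

Since the expected inspection-segment length is 1/p, the same formulas give F as the ratio of expected inspected articles to expected output per segment, and when p = k the cost reduces to C = k whatever the plan.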


C is the average cost per unit of output. The total cost does not increase at 
the constant rate C per article produced. During periods of 100% inspection, 
the total cost increases at the rate k per article produced, or (k−C) per article 
in excess of C. During the remaining periods, every nth article adds that much 
to the cost, while the uninspected articles add on the average p per article to 
the total cost, or (p−C) per article in excess of C. For any production segment, 
let z denote the excess of the total cost over the average for that quantity of 
output, i.e. 

$$z = y - Cx = (k - C)s \quad \text{if } s \le i,$$
$$\phantom{z = y - Cx} = (k - C)s + (p - C)(n-1)(s-i) \quad \text{if } s > i.$$

We have $\mathcal{E}(z) = 0$. 


714 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1958 


Let X denote the total number of articles that have passed the inspection 
point from the outset of inspection, and let Y denote the associated total cost, 
when the inspection process is observed at some point; and (supposing always 
that p is constant) let Z = Y − CX. Then $\mathcal{E}(Z) = 0$, provided we observe the 
process at the termination of the mth segment, where m is chosen in advance, or 
(by Wald’s identity, see [20]) where m is chosen according to any well-defined 
sequential rule such that m has a finite mean value. But $\mathcal{E}(Z)$ does not vanish, in 
general, if the process is observed at an arbitrary preassigned value of X, say 
at X = N, where the Nth article produced is not necessarily an inspected defec- 
tive that terminates a segment. 

For given N, let the Nth article be in the mth production segment. Excep- 
tionally, the mth segment will terminate at this Nth article; otherwise it will 
terminate after a further x* units of output have passed the inspection point. 
Let y* denote the total cost for these x* articles, and z* = y* − Cx*. If N is large, 
the distributions for x*, y* and z* will not depend on N, the incidence of the 
Nth article being “at random” with respect to the inspection segments. Since 
the mean value of Z is zero at the termination of the mth segment (by Wald’s 
identity), 

$$\mathcal{E}(Z \mid X = N) = -\mathcal{E}(z^*),$$

and this is therefore the average excess of the total cost Y over CX, if the 
reckoning is made at any arbitrary point. We denote this by B(p), the bias in 
Z when p is constant. 

So far we have assumed that p has been constant from the outset. Suppose 
now that p has the constant value $p_1$ up to and including the Nth article pro- 
duced (N large and arbitrary), and thereafter p has the constant value $p_2$. Let 
x* and y* as previously defined be now denoted by $x_{12}^*$ and $y_{12}^*$, and let 
$z_{12}^* = y_{12}^* - C_2x_{12}^*$. (The suffix 1 or 2 appended to C or F will denote that $p_1$ or $p_2$ is 
substituted for p.) Let $z_{11}^*$ denote $z_{12}^*$ when p has the constant value $p_1$ after as 
well as before the Nth article (i.e. $p_2 = p_1$), and similarly for $z_{22}^*$. Then if we ob- 
serve the process at any arbitrary point before the change in p occurs, 

$$\mathcal{E}(Z) = B(p_1) = -\mathcal{E}(z_{11}^*),$$

while if we observe it at any arbitrary point long after the change in p has 
occurred, 

$$\mathcal{E}(Z) = -\mathcal{E}(z_{11}^*) + \mathcal{E}(z_{12}^*) - \mathcal{E}(z_{22}^*).$$

(The sum of the first two terms on the right-hand side is the mean value of Z 
at the termination of the first segment after the change, while the last term 
is the increment due to observing at an arbitrary subsequent point.) We may 
therefore reckon the expected cost of the sudden transition from $p_1$ to $p_2$ to be 

$$L(p_1, p_2) = \mathcal{E}(z_{12}^*) - \mathcal{E}(z_{22}^*).$$

We proceed now to find the first term on the right-hand side; the second will be 
obtained from it by substituting $p_2$ for $p_1$. 

For p constantly equal to $p_1$, let a non-negative integer ξ be associated 
with each unit of output, where ξ = 0 if the unit is an inspected defective and so 
terminates a segment, and otherwise ξ is the ordinal number of the unit in the 
production segment to which it belongs. In a long run of output, the propor- 
tion of units having a particular value of ξ is 

$$\sum_{x > \xi} P_x \Big/ \mathcal{E}(x),$$

where $P_x$ is the chance that a production segment is of length x. Now 

$$P_x = q_1^{x-1}p_1 \quad (1 \le x \le i), \qquad P_{i+rn} = q_1^{i+r-1}p_1 \quad (r \ge 0),$$

and $P_x = 0$ if x − i is positive and not divisible by n. Hence the frequency of any 
value of ξ is 

$$p_1F_1q_1^{\xi} \quad (0 \le \xi < i), \qquad p_1F_1q_1^{i+r} \quad (\xi = i + rn + \rho;\ r = 0, 1, 2, \ldots;\ \rho = 0, 1, \ldots, n-1).$$

We shall assume that the value of ξ associated with the Nth article has the 
above chance distribution (this making precise the requirement that N is 
arbitrary). The symbol ξ will now refer to the Nth article, which is the last in 
the output before the change in p. 

If ξ = 0, $x_{12}^* = 0$ and so $z_{12}^* = 0$. For any positive ξ let t be the number of 
further articles that will be inspected, in addition to those already inspected 
when the Nth article has passed the inspection point, up to the termination of 
the current segment. The chance of any value of t (≥ 1) is $q_2^{t-1}p_2$. Given ξ and t, 
if ξ < i we have 

$$z_{12}^* = (k - C_2)t \quad \text{if } t \le i - \xi,$$
$$\phantom{z_{12}^*} = (k - C_2)t + (p_2 - C_2)(n-1)(t - i + \xi) \quad \text{if } t > i - \xi.$$

Averaging over the distribution for t, we have 

$$\mathcal{E}(z_{12}^* \mid \xi, \text{ where } 1 \le \xi < i) = (k - C_2)/p_2 + (p_2 - C_2)(n-1)q_2^{\,i-\xi}/p_2.$$

This formula is also good for ξ = 0, since then the right-hand side vanishes. 
Given ξ and t, if ξ ≥ i let ξ = i + rn + ρ (r = 0, 1, 2, ...; ρ = 0, 1, ..., n−1). Then 

$$z_{12}^* = (k - C_2)t + (p_2 - C_2)\{(n-1)t - \rho\}.$$

Averaging over the distribution for t, we have 

$$\mathcal{E}(z_{12}^* \mid \xi, \text{ where } \xi \ge i) = (k - C_2)/p_2 + (p_2 - C_2)\{(n-1)/p_2 - \rho\}.$$

Averaging now over the distribution for ξ, we find 

$$\mathcal{E}(z_{12}^*) = (k - C_2)/p_2 + (p_2 - C_2)(1 - F_1)\left\{ n\left(\frac{1}{p_2} - \frac{1}{2}\right) + \frac{p_1q_2\,(q_2^{\,i} - q_1^{\,i})}{p_2\,q_1^{\,i}(q_2 - q_1)} \right\}.$$
On introducing the expression for $C_2$, we obtain the desired results: 
TABLE 716 
TRANSITION COSTS FOR A DODGE PLAN (n=20, i=57) WHEN k=0.05 

Transition cost L(p1, p2) 

         p2 = 0.00   0.01   0.03   0.05   0.07   0.10   0.16   0.32   1.00 
p1 = 0.00      0.0   -0.1   -0.3    0.0    2.8    8.1   12.0   13.5    9.0 
     0.01      0.1    0.0   -0.3    0.0    2.7    7.8   11.6   13.0    8.7 
     0.03      0.3    0.3    0.0    0.0    2.2    6.6 
     0.05      0.9    0.9    0.6    0.0    1.1    4.3    6.5    7.2    4.8 
     0.07      1.6    1.7    1.2    0.0    0.0    1.8    3.0    3.3    2.2 
     0.10      2.2    2.3    1.7    0.0   -0.8    0.0    0.6    0.6    0.4 
     0.16      2.4    2.5    1.9    0.0   -1.1   -0.5    0.0    0.0    0.0 
     1.00      2.7    2.7    2.0    0.0   -1.1   -0.5    0.0    0.0    0.0 


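$$L(p_1, p_2) = (k - p_2)F_2\left[(1 - F_2)\left\{ n\left(\frac{1}{p_2} - \frac{1}{2}\right) + i \right\} - (1 - F_1)\left\{ n\left(\frac{1}{p_2} - \frac{1}{2}\right) + \frac{p_1q_2\,(q_2^{\,i} - q_1^{\,i})}{p_2\,q_1^{\,i}(q_2 - q_1)} \right\}\right],$$

$$B(p) = (k - p)(1 - F)\left[ F\left\{ n\left(\frac{1}{p} - \frac{1}{2}\right) + i \right\} - \frac{1}{p} \right].$$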


These results have been derived on the assumption that neither $p_1$ nor $p_2$ 
vanishes. Limiting expressions when one or other tends to zero can be obtained, 
as follows: 

$$B(0) = k\left(i - \tfrac{1}{2}\right)(1 - 1/n).$$

These expressions should be regarded as approximations valid when $p_1$ or $p_2$ 
is close, but not equal, to 0. If p=0 exactly, 100% inspection is not a recurrent 
state of the inspection process, and the results need modification. Similarly in 
Table 716, 0 for $p_1$ or $p_2$ means 0+. 

As for tendency to 1, we have: 

$$B(1) = 0, \qquad L(1, p_2) = B(p_2), \qquad L(p_1, 1) = \tfrac{1}{2}n(1 - F_1)(1 - k).$$

The greatest value of $L(p_1, p_2)$ is attained when $p_1 = 0$, $p_2 = \sqrt{2k}$, when 

$$L = (n-1)\left(1 - \sqrt{k/2}\right)^2,$$

provided that $F_2 = 1$ effectively. Thus for k=0.1, 0.05, 0.02, the greatest values 
for L are respectively 0.60(n−1), 0.71(n−1), 0.81(n−1). 
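
The transition cost can be evaluated numerically by carrying out the averaging over ξ described above. The following Python sketch does this directly from the conditional means, rather than from a closed-form expression; the function names are illustrative assumptions, and the code requires p2 > 0 (a small positive value stands in for 0+).

```python
import math


def fraction_F(p, n, i):
    """F = 1 / (1 + (n - 1) q^i), with q = 1 - p."""
    return 1.0 / (1.0 + (n - 1) * (1.0 - p) ** i)


def average_cost(p, k, n, i):
    """C = p + (k - p) F."""
    return p + (k - p) * fraction_F(p, n, i)


def mean_z12(p1, p2, k, n, i):
    """Mean of z12*: quality p1 up to an arbitrary article, p2 afterwards."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    F1 = fraction_F(p1, n, i)
    C2 = average_cost(p2, k, n, i)
    # Units with xi < i: frequency p1 F1 q1^xi, conditional extra term
    # (n - 1) q2^(i - xi) / p2.
    low = p1 * F1 * (n - 1) * sum(q1 ** xi * q2 ** (i - xi)
                                  for xi in range(i)) / p2
    # Units with xi >= i: total frequency n F1 q1^i = n (1 - F1) / (n - 1),
    # and (n - 1)/p2 - rho averaged over rho = 0..n-1.
    high = (1.0 - F1) * n * (1.0 / p2 - 0.5)
    return (k - C2) / p2 + (p2 - C2) * (low + high)


def transition_cost(p1, p2, k, n, i):
    """L(p1, p2) = E(z12*) - E(z22*)."""
    return mean_z12(p1, p2, k, n, i) - mean_z12(p2, p2, k, n, i)


def bias(p, k, n, i):
    """B(p) = -E(z*) for constant p."""
    return -mean_z12(p, p, k, n, i)


if __name__ == "__main__":
    k, n, i = 0.05, 20, 57
    columns = (0.01, 0.03, 0.05, 0.07, 0.10, 0.16, 0.32, 1.00)
    for p1 in (0.0, 0.01, 0.03, 0.05, 0.07, 0.10, 0.16, 1.00):
        row = [round(transition_cost(p1, p2, k, n, i), 1) for p2 in columns]
        print(p1, row)
    print("max over p2 for p1 = 0:",
          transition_cost(0.0, math.sqrt(2 * k), k, n, i))
```

Run with k = 0.05, n = 20, i = 57, the sketch can be compared entry by entry with Table 716, and the value at p2 = √(2k) with the stated maximum (n−1)(1 − √(k/2))².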


Table 716 gives some values for $L(p_1, p_2)$ when k=0.05, for the plan n=20, 
i=57. This indicates well the general character of the function. Large transi- 
tion losses arise only when $p_1$ is small and $p_2$ is large. If such a change in p, in- 
stead of being abrupt, is made in smaller steps through intermediate values, the 
total transition loss is much reduced. In particular, it will be relatively small if 
p changes very slowly and continuously. Transition losses for $p_1$ large and $p_2$ 
small are all relatively small. Note that the bottom row of the table shows the 
bias function $B(p_2)$. 

The suggested value of ½n for a typical transition cost from good quality to 
bad, given above in §6, was derived by examining Table 716 and some tables 
for other plans, not reproduced here. Obviously it is only a crude estimate, but 
the M that it must be divided by will be still cruder, so there is little point in 
arguing about whether 0.4n or 0.6n (say) would be better. 


Mathematical note. Lieberman and Solomon [26] and Derman et al. [22] 
have made use of Markov chain theory. A rephrasing in similar terms of the 
problem solved above may be helpful. 

We may say that an article in the production flow is in state $E_j$ with respect 
to the inspection if it is the jth article produced since the last previous inspected 
defective (j=1, 2, ...). When p is constant, the states of the successive articles 
produced constitute a Markov chain, such that for certain values of j the transi- 
tion from $E_j$ to $E_{j+1}$ is certain, while for all other values of j the transition from 
$E_j$ to $E_1$ occurs with chance p, and from $E_j$ to $E_{j+1}$ with chance 1−p. The 
initial (say zeroth) state is $E_1$. Provided 0<p<1, the chain is irreducible, 
aperiodic and ergodic, and the stationary chances are easily found. 

Associated with the transition to any state $E_j$ from the preceding state is a 
number $w_j$ depending on j, being the excess over C of the expected total cost 
for the preceding article. The sum of the w's for all transitions between two 
consecutive appearances of state $E_1$ is what is termed z above. The mean value 
of w with respect to the stationary chances is zero. 

The problem studied concerns the effect of initial conditions on the cumula- 
tive sum of the w's. If N and N′ are two arbitrary large fixed integers, if $p = p_1$ 
constantly for the first N articles produced and $p = p_2$ constantly for the next 
N′ articles, the problem is to find the expected value of the sum of the w's for 
all transitions between the Nth and (N+N′)th states in the chain. The method 
used is an ad hoc one, depending on a peculiarity of the transition matrix, 
namely that from $E_j$ the only possible transitions are to $E_{j+1}$ or (for some values 
of j) to the special state $E_1$. Wald’s identity is used to pinpoint returns to $E_1$. 
This method would no doubt become unwieldy for a chain having a more 
general type of transition matrix. Conversely, a general method of studying 
the effect of initial conditions for any Markov chain, based on evaluating all 
possible powers of the transition matrix, would be unwieldy in the present 
instance. 
RANKING METHODS AND THE MEASUREMENT OF ATTITUDES 


R. JARDINE 
An individual’s attitude to a psychological object may be measured 
by the total of his response scores over a set of nondichotomous items. 
Such a set of items is defined to be perfectly homogeneous if items agree 
perfectly in partially ranking a random sample of individuals, i.e., 
where $R_{ij}$ is the response score of the jth individual to the ith item, 
either $R_{mi} \ge R_{mj}$ or $R_{mi} \le R_{mj}$, for any pair of individuals i, j and all 
items m. A coefficient, W, is defined as a measure of the extent of agree- 
ment of actual and perfect patterns of response, and an F test is given 
by which the null hypothesis of random association of scores with an 
individual may be rejected. 


1. INTRODUCTION 


THE concept of attitude in social psychology has its origin in the observation 
that individuals display patterns of behavior which are organized and 
oriented in relation to objects (concepts, situations, etc.) in such a consistent 
and sufficiently permanent manner as to endow thereby those objects with 
personal values. Such values are directly observable as varying in degree and 
may be qualitatively judged as positive/negative, favorable/unfavorable or on 
other such bipolar scales. 

Admitting the concept, the process of rendering it precise has led to the 
development of techniques for the measurement of an individual’s attitude to 
an object which, almost invariably, take as their data the set of responses of 
the individual to a list of items or questions, these latter being constructed so 
as to refer to the attitude or to some one of its dimensions. Usually, it is im- 
plicitly assumed that the individual’s verbal behavior under test is correlated 
with other behavior to which the attitude is relevant and which is of greater 
personal and social import. Furthermore, it may be noted that, although the 
phrase “measurement of attitudes” is very widely used, only ranking with re- 
spect to attitudes is ever actually achieved. 

More precisely, the general procedure being considered here involves the con- 
fronting of an individual with a set of items, each comprising a sequence of 
(k+1) mutually exclusive statements (or questions) and requiring assent to 
one such statement of each item, which thus receives a score 0, 1, 2, . . . or k. 
Each set of statements constitutes a scale for measuring the response of the 
individual to the content of an item, and the sequences are such that higher 
scores indicate a more strongly developed attitude in some consistent direction. 
Consequently, if the set of items is administered to a sample of individuals, a 
matrix of scores, $[R_{ij}]$, is obtained, where $R_{ij}$ is the score of the jth individual 
to the ith item, and the individuals are then distinguished by the total scores, 

$$R_j = \sum_{i=1}^{m} R_{ij},$$

they achieve. 
However, before the obvious step can be taken of identifying this ranking 
of individuals by their total scores as a ranking in attitude, it is necessary to 
define what pattern of response, displayed by the matrix $[R_{ij}]$, indicates a set 
of items which are perfectly acceptable for this purpose of measuring an attitude 
and to provide and apply a statistical test which will reject an appropriate 
null hypothesis. Of the more recent techniques which have been advocated in 
this connection, Guttman’s [2] scalogram analysis is of particular interest, in 
that, unlike many alternatives, it goes some considerable distance towards 
fulfilling these requirements. Nevertheless, the procedure is extremely laborious, 
and a modified version of it devised by Bert F. Green [1] is considerably more 
acceptable from the point of view of calculation while not differing essentially 
from the original insofar as its psychological and statistical bases are concerned. 
However, Green’s procedure is restricted to dichotomous items (k=1), and 
since, in any case, neither of the above approaches is fully satisfactory statis- 
tically, it is of interest to note that this problem may be formulated for the 
general case (k>1), in terms of the partial rankings of individuals by items 
(and/or items by individuals), and that the appropriate statistical tests can be 
derived on the basis of the theorem due to Pitman which provides Kendall [3] 
with his test for the concordance of m rankings. 

To be specific, items can be said to agree perfectly in partially ranking in- 
dividuals if, for any pair of individuals i, j, either $R_{mi} \ge R_{mj}$ or $R_{mi} \le R_{mj}$, for 
all items m. Clearly, this is the simplest adequate choice as to a pattern of 
response to be taken as designating a set of items which are perfectly accept- 
able. However, such perfect agreement is not attained in any actual case, and 
since, if items be wholly independent of each other, random association of 
scores with an individual is to be expected, a test for agreement as against this 
null hypothesis is required. Furthermore, such a test would be equally appli- 
cable to the alternative situation, where, individuals being said to agree per- 
fectly in partially ranking items if for any pair of items m, n either $R_{mi} \ge R_{ni}$ 
or $R_{mi} \le R_{ni}$ for all individuals i, a test for agreement as against the null hy- 
pothesis of random assignment of scores by individuals is required. 

This test of significance is given in the sections immediately following, a 
general discussion of the procedure advocated here and its relation to those of 
Guttman and Green being relegated to the final section. It may be observed 
at this point, however, that, like Green’s procedure but unlike Guttman’s, 
there is no difficulty in applying the test when the number of items (or indivi- 
duals) is large. 


2. GENERAL CASE—SCORING 0, 1, 2, ..., OR k 


Each of m items is scored by each of n individuals 0, 1, 2, ..., or k and the 
scores are entered in an (m × n) table. The items then agree perfectly in partially 
ranking the individuals if the columns of the table (individuals) can be per- 
muted into an order such that, in every row, the scores are nonincreasing from 
left to right as, for example, in Table 722. 
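
The definition of perfect agreement can be checked mechanically. The Python sketch below is one possible reading, not a procedure given in the paper: it tests every pair of individuals (columns) and reports disagreement as soon as one item ranks the pair one way and another item ranks it the other way. The function name and the list-of-rows table layout are assumptions made for illustration.

```python
def agrees_perfectly(table):
    """Return True if the items (rows) agree perfectly in partially
    ranking the individuals (columns).

    For every pair of columns a, b the scores must satisfy either
    R[m][a] >= R[m][b] for all rows m, or R[m][a] <= R[m][b] for all rows m."""
    n = len(table[0])
    for a in range(n):
        for b in range(a + 1, n):
            differences = [row[a] - row[b] for row in table]
            ranks_a_higher = any(d > 0 for d in differences)
            ranks_b_higher = any(d < 0 for d in differences)
            if ranks_a_higher and ranks_b_higher:
                return False
    return True


if __name__ == "__main__":
    ordered = [
        [1, 1, 1, 1, 1, 0],
        [4, 3, 2, 1, 1, 0],
        [3, 3, 3, 2, 1, 0],
        [4, 4, 4, 4, 3, 1],
    ]
    print(agrees_perfectly(ordered))
    print(agrees_perfectly([[1, 0], [0, 1]]))
```

When the pairwise condition holds, ordering the columns by their totals gives a permutation in which every row is nonincreasing, which is the form of the criterion stated above.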

Each row of a table of data (not normally displaying such perfect agreement) 
can be separately permuted into this nonincreasing order. Let C be the sum of 
squares of deviations from the mean of the column totals of this rearranged 
table. 
TABLE 722 
(m=5, n=6, scoring 0-4) 

                   Individuals 
              1   2   3   4   5   6 
         1    1   1   1   1   1   0 
         2    4   3   2   1   1   0 
 Items   3    3   3   3   2   1   0 
         4    4   4   4   4   3   1 
         5 

Form the coefficient W=A/C where A is the sum of squares of deviations 
from the mean of the actual column totals. W provides a measure of agreement 
and can vary from 0 to 1. 

Let $d_i$ be the sum of squares of deviations from the mean of the scores in 
the ith row of the table, calculate 

$$T = \sum_{i=1}^{m} d_i, \qquad X = \frac{\left(\sum_{i=1}^{m} d_i\right)^2 - \sum_{i=1}^{m} d_i^2}{2(n-1)},$$

$$p = \frac{T(CT - T^2 - 4X)}{4CX}, \qquad q = \left(\frac{C}{T} - 1\right)p,$$

and then apply an F test for the significance of W where 

$$F = \frac{qW}{p(1-W)}$$

with degrees of freedom 2p and 2q. 

This test of significance is based on a theorem given in detail by Kendall 
[3, section 7.2] and may be outlined briefly here. The distribution of W in the 
population of $(n!)^m$ possible rankings, given by permuting the scores within 
rows, is required. Now, 

$$W = \frac{T + 2U}{C},$$

where U is a double sum: the sum over all $\binom{m}{2}$ pairs of rows, of the sum of 
products of deviations from the row means of the scores in a pair of rows. Ken- 
dall shows that $E(U) = 0$ and $E(U^2) = X$ (in the notation of this paper), T and 
C, defined previously, being constant under such permutations. Then 

$$E(W) = \frac{T}{C} \quad \text{and} \quad E\{W - E(W)\}^2 = \frac{4X}{C^2},$$

and equating these to the corresponding moments of the beta distribution, with 
parameters p and q, one gets 

$$\frac{p}{p+q} = \frac{T}{C}, \qquad \frac{pq}{(p+q)^2(p+q+1)} = \frac{4X}{C^2},$$

which, being solved for p and q, give 

$$p = \frac{T(CT - T^2 - 4X)}{4CX}, \qquad q = \frac{(C - T)(CT - T^2 - 4X)}{4CX}.$$

Thus, W is approximately a beta variate with parameters p and q as given 
above, so that 

$$F = \frac{qW}{p(1-W)}$$

is approximately an F variate with degrees of freedom 2p and 2q. 
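
The calculation of W and of the F ratio can be sketched in Python as follows. This is an illustrative implementation, not code from the paper: it assumes the quantities T = Σ d_i, X = {(Σ d_i)² − Σ d_i²}/{2(n−1)}, p = T(CT − T² − 4X)/(4CX), q = (C/T − 1)p and F = qW/{p(1−W)}, which follow from equating the mean T/C and variance 4X/C² of W to those of a beta variate. The function name and the returned dictionary are hypothetical.

```python
def sum_sq_dev(values):
    """Sum of squares of deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def concordance_test(scores):
    """scores[i][j] is the score of individual j on item i (m rows, n columns).

    Returns W = A/C together with the beta parameters p, q and the F ratio,
    which is referred to the F distribution with 2p and 2q degrees of freedom."""
    m, n = len(scores), len(scores[0])
    column_totals = [sum(row[j] for row in scores) for j in range(n)]
    A = sum_sq_dev(column_totals)
    # C: column totals after each row is sorted into nonincreasing order.
    sorted_rows = [sorted(row, reverse=True) for row in scores]
    C = sum_sq_dev([sum(row[j] for row in sorted_rows) for j in range(n)])
    d = [sum_sq_dev(row) for row in scores]
    T = sum(d)
    X = (T ** 2 - sum(di ** 2 for di in d)) / (2 * (n - 1))
    p = T * (C * T - T ** 2 - 4 * X) / (4 * C * X)
    q = (C / T - 1) * p
    W = A / C
    F = float("inf") if W >= 1 else q * W / (p * (1 - W))
    return {"W": W, "T": T, "X": X, "C": C, "p": p, "q": q,
            "F": F, "df": (2 * p, 2 * q)}


if __name__ == "__main__":
    rankings = [[1, 2, 3, 4], [2, 1, 3, 4], [1, 3, 2, 4]]
    print(concordance_test(rankings))
```

A tail probability for F would require an F distribution routine with non-integer degrees of freedom, which is deliberately left out of this sketch.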

In this application m and n will invariably be greater than is necessary to 
justify the use of the F test and significance at some level, say, 1%, may be 
demanded before agreement of the items be accepted. Furthermore, it may be 
desirable to require that W be greater than some value, selected by experience, 
so as to ensure a high level of agreement in the partial rankings of the individ- 
uals, for much the same reason that a correlation coefficient needs to be not 
merely significant, but large, if it is to be of much predictive use. A test for W 
in the nonnull case is not available, but this does not appear to be of importance, 
in practice, since there are no theoretical grounds for specifying any particular 
value. Provided, then, that the above criteria are satisfied, individuals may be 
assigned their total scores as measures of their attitudes and ranked accordingly. 

It may be noted that this test reduces to Kendall’s test for the concordance of 
m rankings in the case where there are no ties and the items are ranked 1 to n. 
In that case, 

$$d_i = \frac{n(n^2-1)}{12}, \qquad T = \frac{mn(n^2-1)}{12}, \qquad X = \frac{m(m-1)(n-1)n^2(n+1)^2}{288}, \qquad C = \frac{m^2n(n^2-1)}{12}.$$


3. SPECIAL CASE—SCORING 0 OR 1 


An important, frequently occurring case is when items are dichotomous, 
since then the calculations simplify considerably. Each of m items is scored by 
each of n individuals 0 or 1 (yes/no, agree/disagree, etc.) and the scores are 
entered in an (m × n) table. If $M_k$ items are scored 1 by exactly k individuals 
and $m_k$ by at least k, then, 

$$m_n = M_n, \qquad m_{n-1} = M_{n-1} + M_n, \ \text{etc.}, \qquad m_1 = \sum_{k=1}^{n} M_k.$$

If, then, there is perfect agreement of items in partially ranking individuals, 
the column totals in the table will be the numbers $m_1, m_2, \ldots, m_n$ in some order. 
Let C be the sum of the squares of deviations from the mean of these numbers. 

Form the coefficient W = A/C, as before, where A is the sum of squares of 
deviations from the mean of the actual column totals. Then tabulate as in 
Table 724, 

TABLE 724 

   k         M_k          k(n−k)        k²(n−k)² 
   1         M_1          1(n−1)        1²(n−1)² 
   2         M_2          2(n−2)        2²(n−2)² 
   ... 
   k         M_k          k(n−k)        k²(n−k)² 
   ... 
   n−1       M_{n−1}      (n−1)1        (n−1)²1² 

calculate 

$$Y = \sum_{k} M_k\,k(n-k), \qquad Z = \sum_{k} M_k\,k^2(n-k)^2,$$

$$p = \frac{Y\{2Z + n(n-1)CY - (n+1)Y^2\}}{2nC(Y^2 - Z)}, \qquad q = \left(\frac{nC}{Y} - 1\right)p,$$

and apply an F test for the significance of W where 

$$F = \frac{qW}{p(1-W)}$$

with degrees of freedom 2p and 2q. 
With reference to the preceding section, $d_i = x(n-x)/n$ when the ith row has 
x 1’s and (n−x) 0’s, T = Y/n and 

$$X = \frac{Y^2 - Z}{2n^2(n-1)}.$$
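
For dichotomous items the shortcut can be sketched in Python and compared with the general formulas of the preceding section. The sketch assumes Y = Σ x(n−x) and Z = Σ x²(n−x)² summed over rows, where x is the number of 1's in a row, so that T = Y/n and X = (Y² − Z)/{2n²(n−1)}; the function name and returned dictionary are hypothetical.

```python
def dichotomous_parameters(table):
    """table[i][j] is 0 or 1: the score of individual j on item i.

    Returns Y, Z, C and the beta parameters p, q from the shortcut formulas,
    together with T and X of the general case for comparison."""
    n = len(table[0])
    ones_per_row = [sum(row) for row in table]
    Y = sum(x * (n - x) for x in ones_per_row)
    Z = sum((x * (n - x)) ** 2 for x in ones_per_row)
    # m_k: number of items scored 1 by at least k individuals; these are the
    # column totals under perfect agreement.
    m_at_least = [sum(1 for x in ones_per_row if x >= k)
                  for k in range(1, n + 1)]
    mean = sum(m_at_least) / n
    C = sum((v - mean) ** 2 for v in m_at_least)
    p = Y * (2 * Z + n * (n - 1) * C * Y - (n + 1) * Y ** 2) \
        / (2 * n * C * (Y ** 2 - Z))
    q = (n * C / Y - 1) * p
    T = Y / n
    X = (Y ** 2 - Z) / (2 * n ** 2 * (n - 1))
    return {"Y": Y, "Z": Z, "C": C, "p": p, "q": q, "T": T, "X": X}


if __name__ == "__main__":
    example = [[1, 1, 0, 0], [1, 0, 1, 0], [1, 1, 1, 0]]
    r = dichotomous_parameters(example)
    general_p = r["T"] * (r["C"] * r["T"] - r["T"] ** 2 - 4 * r["X"]) \
        / (4 * r["C"] * r["X"])
    print(r["p"], general_p)
```

The two values printed agree, which is the sense in which the dichotomous formulas are a special case of the general ones.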


4. DISCUSSION 


In the foregoing sections a statistical test is provided for the agreement of m 
partial rankings, and it may be applied to either the partial rankings of in- 
dividuals by items or to those of items by individuals. One may now consider, 
in more detail, the question of what criteria are appropriate to the substantiat- 
ing of the judgement that a set of items measures accurately a single attitude. 

In the first place, it is evident that, since a ranking of individuals is required, 
the demand for agreement of items in partially ranking individuals is obligatory. 
However, this condition alone does not justify the desired inference, for (a) 
the items may be too similar, different expressions of the same or similar con- 
tent, and replication is then inadequate, or (b) the items may tap several 
closely correlated attitudes, and, while this would not matter so much from 
the point of view of description in a static population, it would be undesirable if 
attitudes were subject to change, for correlations need not then persist and a 
set of items formerly agreeing in partially ranking individuals might not con- 
tinue to do so. 

Now, while the necessary further conditions are largely satisfied by requiring 
a careful phrasing of items (and assuming this hereafter) so as to ensure refer- 
ence, as widely as possible, to a single object, one may also consider, in this 
connection, the relevance of the other partial ranking, of items by individuals. 

Clearly, the two partial rankings are on a different footing, for a ranking of 
items is not required in its own right, and to demand agreement of individuals 
in this respect would, therefore, be appropriate only if such agreement could be 
shown to be a necessary consequence of the assumptions as to the underlying 
attitude to which one is committed by the procedure laid down. Consideration, 
then, of this latter suggests that three assumptions are unavoidable, viz., 

(a) The individuals of the population being investigated are “intrinsically” 
measurable, in some sense, ($x_j$ for the jth individual), with respect to the 
attitude or a dimension of the attitude; 

(b) There are items, the responses to which by individuals of the popula- 
tion (apart from random errors of response), depend on x only, i.e., 
$R_{ij} = R_i(x_j)$; 

(c) For any such item, $R_i(x)$ is a non-decreasing function of x; 

for there is nothing to measure if the first of these is not true, and if items do 
not fulfil the second and third they ought not to be in the selected set. 

While, then, agreement as to the partial ranking of individuals is a necessary 
condition to be satisfied by the items, there is evidently no such condition that 
the converse partial ranking should hold. Even so, it is conceivable that the 
items of the complete set appropriate to an attitude (i.e., the set of all items 
such that any $R_i(x)$ is a nondecreasing function of x only) differ only in rela- 
tively restricted ways. Such a set of functions $R_i(x)$ might, for example, con- 
stitute a family of non-intersecting curves (supposing responses to be con- 
tinuous), in which case agreement of individuals in partially ranking items 
would also be expected, and if not achieved would be indicative of hetero- 
geneity. 

However, it would seem that one has no choice but to discount this latter 
possibility, for there appears to be no plausible psychological basis for any 
postulate further restricting the form of such response functions, and it is 
only reasonable then to adopt the principle that all items which agree in their 
partial rankings of individuals are equivalent scales (and are suitable replicates) 
for the measurement of a single attitude. Nevertheless, although the empirical 
virtue of this assertion is obvious, it may be doubted whether the inference to a 
single attitude is strictly justified, or whether, therefore, McNemar [5, p. 
311], referring to Guttman’s procedure and so to the principle above, is entitled 
to consider it as “assuring that a single dimension is involved in the retained 
items.” This being so, it is likely that only in the long run can the validity of 
an attitude scale be recognized, and then by the pragmatic criteria of its 
successful predictive use and of continued agreement of rankings under condi- 
tions of changing attitudes. 

In consequence of this discussion, therefore, it is concluded that a procedure 
for the measurement of an attitude should involve the construction of a set of 
items, giving adequate replication and with k>1 when possible; and should 
require of these items that they agree in partially ranking a sample of individ- 
uals, the test of significance previously described being applied. Although it 
has been assumed throughout this paper that each item comprises the same 
number, (k+1), of statements, the test is in no way affected if items differ in 
their respective values of k. 

Now, as regards scale analysis, it may be noted that both Guttman and 
Green accept as sufficient this criterion that items are to agree in partially rank- 
ing individuals, and their procedures purport to test the matter. Guttman 
[2, p. 88] states—“A rank order of people is meaningful if, from the person’s 
rank order, one knows precisely his responses to each of the questions or acts in- 
cluded in the scale.” This is equivalent to the criterion as stated above, in that, 
if items agree perfectly in partially ranking individuals, then the m(k+1) state- 
ments, comprised by the m items, can be permuted into an order such that 
every individual displays a set of responses which is one of (mk+1) possible 
types, these being distinguished by total scores ranging from 0 to mk and may 
be designated by selecting any one of the first (mk+1) statements of the se- 
quence, and then the next (m—1) in succession, allowing no repetitions of any 
item. Thus, given this ordered set of statements and an individual’s total score, 
his actual responses are determinable. All this does not, of course, entail any 
perfect partial ranking of items by individuals. 

Green, presumably, accepts the same criterion, although, by commencing 
with initially dichotomized items, the distinction between the partial rankings 
is obliterated. For when k=1, perfect agreement in one partial ranking entails 
perfect agreement in the other, even though this is not generally so when 
k>1. In fact, the greater k may be, the greater is the scope for disagreement in 
one partial ranking when there is perfect agreement in the other. Consequently, 
a possible error may be involved in the initial use of arbitrarily dichotomized 
items, for, if those same items, having response scales with k>1 are such 
that one ranking shows perfect agreement and the other not, then dichotomiza- 
tion forces both rankings into perfect agreement. However, only the ranking 
which shows perfect agreement when k>1 will be independent of the points of 
dichotomy, whereas the other ranking will vary according to how these are 
selected. Thus, it is necessary that k be greater than 1 if the nature of the re- 
sponse pattern is to be fully appreciated. 

Finally, it may be observed that the general theory of psychological tests 
includes attitude scale construction as a special case. Thus items (or tests) may 
be verbal or nonverbal so long as an individual’s response to the content of an 
item is measurable on a preassigned scale, and the problem is then one of 
homogeneity, i.e., whether or not all the items of a set are measuring the same 
one underlying psychological variable. Loevinger [4] has developed a theory 
as to the homogeneity of a set of tests in the dichotomous case, and has dis- 
cussed its application to attitude scales. However, without considering the 
details of her theory, it may be noted that, in consequence of the formal 
identity of all special cases (attitudes, abilities, etc.) subsumed under the head- 
ing of psychological tests, the statistical procedure given earlier is applicable 
to any one of them and in the general nondichotomous case. 

Nevertheless, the measurement of an ability is worthwhile considering briefly, 
for, although the issues involved are identical with those already discussed, the 
concepts concerned are, perhaps, more sharply defined. Thus, individuals differ 
in ability, tests, in difficulty, and these latter may be scored 0, 1 (fail/pass) or 
0, 1, 2, ..., or k (increasing degree of success), response being a non-decreasing 
function of ability. Conceivably, now, two tests might be such that one pro- 
duces an “all or nothing” response (above or below a narrow range of abilities) 
while the other displays scores increasing more or less steadily with ability. 
Clearly, with a mixture of such types of tests (all agreeing perfectly in partially 
ranking individuals) individuals cannot agree perfectly in partially ranking 
tests, and this raises, in a different context, the issue discussed at some length 
before. Is this indicative of more than one ability, of two dimensions of one 
ability, or what? The practical answer is probably that there is one ability, 
but such a decision settles nothing from a theoretical point of view. 

Loevinger, in the course of discussion, makes a distinction of some interest 
[4, p. 527] between cumulative homogeneous tests “when the items are ordered 
... (so that) each person scores plus up to a characteristic item and minus on 
all subsequent items” and differential homogeneous tests when “there is an 
order of items such that each individual scores minus up to a characteristic 
item, plus up to another characteristic item, and minus on subsequent items.” 
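
The distinction can be made concrete with a small sketch. The function below is illustrative only: the name `classify_pattern` and the treatment of degenerate patterns (all plus, all minus, or minus followed only by plus) are assumptions of this sketch, not part of Loevinger's definitions.

```python
def classify_pattern(responses):
    """Classify one person's plus/minus responses over items taken in order.

    responses: sequence of booleans (True = plus, False = minus).
    Returns "cumulative", "differential", or "neither".
    """
    # Collapse consecutive repeats, e.g. [T, T, F, F] -> [T, F].
    runs = [r for i, r in enumerate(responses) if i == 0 or r != responses[i - 1]]
    # Plus up to a characteristic item, minus afterwards
    # (all-plus and all-minus are treated here as degenerate cumulative cases).
    if runs in ([True], [False], [True, False]):
        return "cumulative"
    # Minus, then plus, then minus (or minus then plus, a degenerate case).
    if runs in ([False, True], [False, True, False]):
        return "differential"
    return "neither"


print(classify_pattern([True, True, False, False]))    # cumulative
print(classify_pattern([False, True, True, False]))    # differential
```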

The type of item defined by the conditions laid down in this paper is cumula- 
tive, but it is of interest to note that the differential type can arise in two ways. 
Firstly, response may not be a nondecreasing function of the underlying variable 
and such items have been ruled out as undesirable (statistically, not necessarily 
psychologically). Secondly, a single statement may be selected from the set 
associated with an item (of the type defined previously) and then an individual 
would withhold assent if he were prepared to assent to a statement either higher 
or lower in the sequence. Thus, so far as that selected statement is concerned, 
response is not a nondecreasing function of the underlying variable. However, 
this is avoidable, for any item can be reduced to a single statement by forming 
the logical disjunction (appropriately phrased) of all the statements after some 
point of the sequence, and the resulting item is then of the cumulative type. 

The main objection to differential type items would appear to be that they 
cannot agree in partially ranking individuals, and since this ranking is the 
primary desideratum, it is advisable to have it directly established as a matter 
of empirical fact, even if there be some doubt as to its precise meaning. 

RANDOMIZATION TESTS FOR A MULTIVARIATE TWO- 
SAMPLE PROBLEM 


J. H. AND D. A. S. 
University of Toronto 

With few observations involving a large number of variables the T² 
test for the multivariate two-sample problem may not exist. Some 
alternative tests based on randomization methods are suggested and 
two of these are applied to an example. Also, valid randomization tests 
can be obtained by using subgroups of permutations; this provides a 
simple method for reducing computation which is desirable when the 
sample sizes are not small. 


INTRODUCTION 


A FAMILIAR two-sample problem is to test whether two samples have come 
from identical populations. A frequently considered alternative is slippage 
—that the populations differ in location. Many tests for the case of observa- 
tions on a single variable may be found in the literature. However, for the case 
of observations on several variables there are only a few tests such as Hotelling’s 
T²-test for normal theory and the Wald-Wolfowitz modification using 
the T² statistic with permutations of observations [7]. These T²-tests require 
the inversion of a k × k matrix (k is the number of variables) and in consequence 
the labour of computation increases rapidly with k. Also, if k+2 exceeds the 
size of the combined samples, the matrix is singular and the test does not exist. 
Tests based on permutations of observations are non-parametric tests. For 
the two-sample problem they require no more than the basic assumption of the 
problem: that under the null hypothesis the two samples behave as a single 
sample from a population. Actually, they require less—that, under the null 
hypothesis, the probability distribution is symmetric under all permutations 
of the observations. If the populations correspond to different “treatments,” 
this symmetry can be assured by randomly assigning the treatments to the 
experimental units. As a result, these tests are often called randomization 
tests. For some examples and references, see Wald and Wolfowitz [7]. 

In this paper several randomization tests are proposed for the multivariate 
two-sample problem. They were developed primarily for the normal-theory 
two-sample problem having no T²-test, but they are valid more generally; they 
are nonparametric tests. In Section 3 two of these tests are applied to an 
example involving 62 measurements on each of 12 people, 4 alcoholics and 8 
non-alcoholic controls. In Section 4 a method is proposed for avoiding the 
prohibitive amount of computation that is necessary if these tests are applied 
directly to large samples. 


THE TWO-SAMPLE PROBLEM 


Consider the two-sample problem and suppose that each observation provides 
measurements on k variables. Let the first sample, consisting of m observations, 
be designated by $(x_{1j}, \ldots, x_{kj})$ for $j = 1, \ldots, m$, and let the second sample, 
consisting of n observations, be designated by $(y_{1j}, \ldots, y_{kj})$ for $j = 1, \ldots, n$. 

We propose some randomization tests of the null hypothesis that the populations 
are identical against the alternative hypothesis that there is a difference 
of “location” for some of the variables. 

When searching for good test statistics we often try to maximize power. 
Here, we do not consider power. Rather we choose statistics intuitively with a 
view to obtaining statistics that are sensitive towards the type of outcome to 
be expected under the alternative. For the randomization tests the distribution 
under the null hypothesis is in a sense always available, but the test statistic 
needs to be evaluated many times. As a result, the only property other than 
sensitivity that we shall consider is the ease with which the test statistic can be 
evaluated. 

First, we consider the distribution of a test statistic under the null hypothesis. 
If the two samples are from the “same population” the joint probability dis- 
tribution is symmetric under any permutation of the m+n observations. Each 
permutation thus has the same probability, which must therefore be 1/(m+n)!. 
Accordingly, the distribution of a test statistic is discrete and it has probability 
1/(m+n)! at each of the values of the statistic as obtained from the (m+n)! 
permutations of observations. 

The above distribution is really a conditional distribution. For, if we are 
given the information that there are a specific m+n observations in the com- 
bined sample, then the conditional distribution is concerned with the division 
of these observations into a “first sample” of m and a “second sample” of n. 
If we think of an ordered first sample of m and an ordered second sample of n, 
then the conditional distribution has equal probability 1/(m+n)! for each of 
the (m+n)! permutations. 

As an example, consider a first sample of m=2 and a second sample of n=1 
and suppose the combined sample contains the three observations {1.7, 1.2, 
2.5}. Then the conditional probability is 1/6 for each of the 6 permutations: 
(1.7, 1.2; 2.5), (1.2, 1.7; 2.5), (1.7, 2.5; 1.2), (1.2, 2.5; 1.7), (2.5, 1.2; 1.7), (2.5, 
1.7; 1.2). 
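
The enumeration in this small example can be checked mechanically. The following is a minimal sketch in Python (the variable names are illustrative and not from the paper); it lists the six ordered arrangements with their conditional probability and also counts the distinct divisions once order within each sample is ignored.

```python
from fractions import Fraction
from itertools import permutations

combined = (1.7, 1.2, 2.5)   # the three observations from the example
m = 2                        # size of the first sample; the rest form the second

perms = list(permutations(combined))
prob = Fraction(1, len(perms))   # 1/(m+n)! under the null hypothesis

for p in perms:
    print(p[:m], p[m:], prob)

# A statistic that ignores the order within each sample only distinguishes
# the divisions into a first sample of m and a second sample of n.
divisions = {(frozenset(p[:m]), frozenset(p[m:])) for p in perms}
print(len(divisions))
```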

For a distribution not under the null hypothesis the conditional probabilities 
will in general not be all equal. Ordinarily, there will be more probability for 
those permutations that produce a typical “outcome” of the alternative distribu- 
tion and less for the others. A randomization test can be obtained by taking 
any reasonable statistic and choosing a critical value on the basis of the condi- 
tional distribution above. If a test has a certain significance level conditionally, 
it has of course the same significance level with respect to the marginal dis- 
tribution. 

We now construct some test statistics that are manageable when k is large. 
Our approach will be to take a statistic suitable for the single variable case, 
apply it to each of the k variables and add the resulting expressions. To do more 
than this, we would seemingly have to take account of sample covariances, as 
for example in Hotelling’s T², and this would require considerably more 
computation. 

For measuring slippage the absolute value of the difference in sample means is 
simple. However, to prevent gross differences in the scaling of the variables from 
unbalancing the sum function, it is reasonable to divide this by some scale 
function such as the within-sample standard deviation or mean deviation. We 
then obtain the test statistic 

$$\sum_{i=1}^{k} \frac{|\bar{x}_i - \bar{y}_i|}{s_i} \qquad (1)$$

where 

$$\bar{x}_i = \frac{1}{m}\sum_{j=1}^{m} x_{ij}, \qquad \bar{y}_i = \frac{1}{n}\sum_{j=1}^{n} y_{ij},$$

and $s_i$ is the within-sample mean deviation or standard deviation for the ith variable; 
the latter is given by 

$$s_i^2 = \frac{1}{m+n-2}\left[\sum_{j=1}^{m}(x_{ij}-\bar{x}_i)^2 + \sum_{j=1}^{n}(y_{ij}-\bar{y}_i)^2\right].$$

A similar statistic that would better emphasize a large value for one of the 
variables is 

$$\sum_{i=1}^{k} \frac{(\bar{x}_i - \bar{y}_i)^2}{s_i^2}. \qquad (2)$$

If the variables are independent and if the variances are known and are used 
in place of their estimates in (2), then the resulting test is most stringent for normal 
alternatives among nonparametric tests; see Lehmann and Stein [3]. 

It is reasonable to have a test statistic independent of the order of the ob- 
servations in the first sample and independent of the order of the observations 
in the second sample. Such a test statistic need be evaluated only for the $\binom{m+n}{m}$ 
different divisions of the m+n observations into a “first sample” of m and a 
“second sample” of n. The statistics just introduced are of this type. Even for 
samples of moderate size, however, this number of divisions $\binom{m+n}{m}$ can be 
very large. This weighs heavily against (1) and (2) because of the calculation 
of the $s_i$. A considerable simplification can be obtained by using ranks or coded 
ranks for each variable. Let $r_{ij}$, $s_{ij}$ be the ranks of $x_{ij}$, $y_{ij}$ among the values for 
the ith variable: 

$$\{x_{i1}, \ldots, x_{im}, y_{i1}, \ldots, y_{in}\}.$$

A modified form of the statistic (1) is 

$$\sum_{i=1}^{k} \left| \frac{1}{m}\sum_{j=1}^{m} r_{ij} - \frac{1}{n}\sum_{j=1}^{n} s_{ij} \right|. \qquad (3)$$

By coding the ranks to have mean zero for each variable, 

$$r'_{ij} = r_{ij} - \frac{m+n+1}{2}, \qquad s'_{ij} = s_{ij} - \frac{m+n+1}{2},$$

we obtain an equivalent but simpler statistic, 

$$\sum_{i=1}^{k} \left| \sum_{j=1}^{m} r'_{ij} \right|. \qquad (4)$$

A modification of the statistic (2) is 

$$\sum_{i=1}^{k} \left( \sum_{j=1}^{m} r'_{ij} \right)^2. \qquad (5)$$

We might prefer (5) to (4) by analogy with the usual analysis-of-variance test 
statistic. For the one-variable case (k=1) the tests based on (4) and (5) are 
just the Wilcoxon-Mann-Whitney test. 

For the one-variable case, the rank test locally most powerful for a normal 
alternative of slippage is obtained by recording ranks as shown below and using 
the difference in means as a test statistic; see Terry [5]. Suppose the m+n rank 
values 

$$1, 2, \ldots, m+n$$

are replaced respectively by the mean values of the order statistics for a sample 
of m+n from the standardized normal distribution, that is, by 

$$E z_{(1)},\; E z_{(2)},\; \ldots,\; E z_{(m+n)},$$

where $z_{(1)}, \ldots, z_{(m+n)}$ designate the order statistics for a sample of m+n from 
the standardized normal. These are tabulated in [2], page 66 for samples up to 
size 30. Let $r_{ij}^*$, $s_{ij}^*$ be the modified values for $r_{ij}$, $s_{ij}$; then the statistic (4) be- 
comes 

$$\sum_{i=1}^{k} \left| \sum_{j=1}^{m} r_{ij}^* \right| \qquad (6)$$

and the statistic (5) becomes 

$$\sum_{i=1}^{k} \left( \sum_{j=1}^{m} r_{ij}^* \right)^2. \qquad (7)$$


All of these test statistics should give reasonable tests against the alternative 
of a difference of location; they are consistent for the multivariate normal with 
a difference of location. The choice among them would be based on convenience, 
personal preference, or perhaps a Monte Carlo evaluation of power for some 
particular alternative distributions. 
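
As an illustration of the coded-rank statistics, the following Python sketch computes them for a small made-up data set. It assumes that statistic (4) is the sum over variables of the absolute value of the first-sample sum of mean-zero coded ranks, and that (5) uses the square in place of the absolute value; the function names are hypothetical and ties are not handled.

```python
def ranks(values):
    """Ranks 1..N of the values (ties are not handled in this sketch)."""
    order = sorted(range(len(values)), key=lambda j: values[j])
    r = [0] * len(values)
    for rank, j in enumerate(order, start=1):
        r[j] = rank
    return r


def coded_rank_sums(x, y):
    """x: m observations, y: n observations, each a sequence of k measurements.

    Returns, for each variable, the sum over the first sample of the ranks
    coded to have mean zero, i.e. rank - (m + n + 1) / 2.
    """
    m, n, k = len(x), len(y), len(x[0])
    centre = (m + n + 1) / 2
    sums = []
    for i in range(k):
        pooled = [obs[i] for obs in x] + [obs[i] for obs in y]
        r = ranks(pooled)
        sums.append(sum(r[j] - centre for j in range(m)))
    return sums


def statistic_4(x, y):
    """Assumed form of (4): sum over variables of |coded rank sum|."""
    return sum(abs(s) for s in coded_rank_sums(x, y))


def statistic_5(x, y):
    """Assumed form of (5): sum over variables of (coded rank sum) squared."""
    return sum(s * s for s in coded_rank_sums(x, y))


# Made-up data: two observations per sample, two variables each.
x = [(1.0, 10.0), (2.0, 9.0)]
y = [(3.0, 8.0), (4.0, 7.0)]
print(statistic_4(x, y), statistic_5(x, y))
```

Because the coded ranks for each variable sum to zero over the combined sample, only the first-sample sums are needed, which is what keeps the evaluation cheap.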


AN EXAMPLE 


The researchers [1] have reported on a preliminary investigation of the 
metabolic characteristics of compulsive alcoholics (during non-drinking pe- 
riods) as opposed to those of persons who show no evidence of alcoholic tend- 
encies. A purpose of the experiment was to discover differences that could be 
subject to further investigation. Another was to reach a simple conclusion as to 
whether the data did or did not support the contention that compulsive drinkers 
have different metabolic characteristics. The statistical analysis has been 
criticized by Popham [4]. A careful survey of statistical techniques applicable 
to the problem has been made by Tukey [6]. 

In the experiment sixty-two metabolic characteristics were measured on four 
alcoholics and eight non-alcoholic controls. For each of the characteristics a two- 
sample t-statistic was calculated and from it the corresponding probability level 
obtained from tables of the t-distribution (two-sided). It was observed that six 
of these sixty-two probability values were in the 5% zone, whereas approxi- 
mately three are expected when the characteristics are independent of drinking 
compulsion. The conclusion was made “that compulsive drinkers do have 
individualistic metabolic characteristics”. 

In [4] Popham criticized the conclusion on the grounds that the occurrence 
of six extreme probabilities was not significant when three of the six were ex- 
pected without individualistic metabolic characteristics. 

In [6] Tukey observes that the characteristics may not be independent of 
each other and that this would invalidate Popham’s quantitative evaluation 
of the probabilities. Tukey then discusses techniques of analysis both with and 
without the assumption of independence. 

The anticipated type of difference between populations is a shift of mean for 
some of the characteristics. The number of characteristics or variables being 
measured is large, k=62. This is the type of problem for which we proposed 
some randomization tests in the preceding section. 

Two of the tests were computed on a Ferranti computer, Ferut, at the Uni- 
versity of Toronto. To conserve machine time the two simplest tests were 
chosen, (4) and (6). 

The first sample (alcoholics) has m=4 and the second sample (non-alcoholics) 
has n=8. Given the m+n=12 observations, there are $\binom{12}{4} = 495$ different ways 
of dividing these into a set of 4, thought of as a “first sample,” and a set of 8, 
thought of as a “second sample”. For each such division the value of the statis- 
tic (4) was calculated. The first value calculated corresponded to the observed 
division into first and second sample observations. This value was found to be 
at the 91.7% point of the 495 values in the conditional distribution. Using the 
statistic (6), the 93% point was obtained. Thus the results by either statistic 
are significant at the 10% level, but not at the 5% level. These probability 
levels are substantially the same as those obtained from a number of other 
tests by Professor Tukey. 
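
The enumeration over all 495 divisions can be sketched as follows. This is an illustration under stated assumptions, not the Ferut program: the data are synthetic, the helper names are hypothetical, and a simple unscaled mean-difference statistic stands in for (4) or (6).

```python
import random
from itertools import combinations


def randomization_test(data, m, statistic):
    """data: list of m+n observations, the first m being the observed first sample.

    statistic(first, second) -> number, larger meaning more extreme.
    Returns (number of divisions, fraction of divisions with value >= observed).
    """
    indices = range(len(data))
    observed = statistic(data[:m], data[m:])
    count = total = 0
    for first_idx in combinations(indices, m):
        chosen = set(first_idx)
        first = [data[j] for j in first_idx]
        second = [data[j] for j in indices if j not in chosen]
        total += 1
        if statistic(first, second) >= observed:
            count += 1
    return total, count / total


def mean_diff_stat(first, second):
    """Sum over variables of |difference in sample means| (unscaled, illustrative)."""
    k = len(first[0])
    return sum(
        abs(sum(o[i] for o in first) / len(first)
            - sum(o[i] for o in second) / len(second))
        for i in range(k)
    )


rng = random.Random(0)
# Synthetic data: 12 observations on 5 variables, no real difference built in.
data = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(12)]
total, p_value = randomization_test(data, 4, mean_diff_stat)
print(total, p_value)
```

An observed value at the 91.7% point of the conditional distribution corresponds, in this notation, to a fraction of roughly 0.083 of the divisions giving a value at least as large.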

The computation took approximately one day for programming and two 
minutes of machine time. For larger samples the required machine time would 
increase very rapidly, faster than the function $\binom{m+n}{m}$ increases! 


FOR LARGE SAMPLES 


In the example the sample sizes were small, 4 and 8. If we increase the sample 
sizes the number of values of the statistic that need to be calculated, $\binom{m+n}{m}$, in- 
creases very rapidly. Even for samples of the modest size 10, the number of 
values is in the neighbourhood of 190,000 and an exceptional computer would 
be needed. 
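
The growth in the number of divisions is easy to tabulate. A short sketch, assuming two samples of equal size (the variable names are illustrative):

```python
from math import comb

divisions_example = comb(12, 4)   # the example: samples of 4 and 8
divisions_ten = comb(20, 10)      # two samples of size 10

print(divisions_example, divisions_ten)
for size in (4, 6, 8, 10, 15, 20):
    print(size, comb(2 * size, size))
```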

There is no essential reason why all the permutations or combinations need be 
used. Suppose we have some rules for permuting the sequence of m+n ob- 
servations. Successive application of these permutations may produce new 
permutations, but eventually there will be no new permutations produced. 
The resulting collection of different permutations of the sequence of m+n 
observations is then a group. 

For the example let 1, 2, ..., 12 designate the twelve observations; the 
first sample contained the observations 1, 2, 3, 4 and the second sample con- 
tained the remaining observations. Consider the following permutations 


(1, 2, 3, 4; 5, 6, 7, 8, 9, 10, 11, 12) 
(5, 6, 7, 8; 9, 10, 11, 12, 1, 2, 3, 4) 
(9, 10, 11, 12; 1, 2, 3, 4, 5, 6, 7, 8). 


The second, as a permutation of the first, if applied twice produces the third. 
If any of these are applied repeatedly or successively, no new permutation is 
produced—they form a group of permutations. 

Suppose each of the above permutations is applied to the original sequence 
of 12 observations. Suppose then that we are given only the information that 
the observed permutation is one of the three. The conditional probability for 
each of the three, from symmetry, must be 1/3 under the null hypothesis. There 
is a corresponding distribution for the values of any test statistic. 

Consider now the statistic (6). Its values for the three permutations are 92.1, 
72.5, 80.5. The only reasonable significance level available is 33 1/3%. The observed 
value is 92.1; it is significant at the 33 1/3% level and at that level the null 
hypothesis would be rejected. 
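
The closure of these three permutations, and the resulting conditional level, can be verified with a short sketch. The helper `apply` and the variable names are illustrative; the three values of the statistic are those quoted in the text.

```python
from fractions import Fraction

identity = tuple(range(1, 13))
shift = tuple(range(5, 13)) + tuple(range(1, 5))   # (5, ..., 12, 1, ..., 4)


def apply(arrangement, sequence):
    """Place in position i the element found at position arrangement[i] of `sequence`."""
    return tuple(sequence[a - 1] for a in arrangement)


# Generate everything reachable by applying `shift` repeatedly.
group = [identity]
while True:
    nxt = apply(shift, group[-1])
    if nxt in group:
        break
    group.append(nxt)

# Values of statistic (6) quoted in the text, keyed by the first four labels.
values = {(1, 2, 3, 4): 92.1, (5, 6, 7, 8): 72.5, (9, 10, 11, 12): 80.5}
observed = values[identity[:4]]
p_value = Fraction(sum(values[g[:4]] >= observed for g in group), len(group))
print(len(group), p_value)
```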

Any group of permutations of the m+n observations may be used to con- 
struct a test.† Given the information that the observed permutation is one of 
such a group, then under the null hypothesis the conditional probability for any 
particular permutation is the reciprocal of the number of permutations in the 
group. 

Consider the example again. For a significance level close to 5% we need 
more permutations. By pairing the observations successively from numbers 1 
to 12 and taking all permutations of the six pairs, we obtain 15 different “first 
samples”: 


(1, 2, 3, 4), (1, 2, 5, 6), (1, 2, 7, 8), (1, 2, 9, 10), (1, 2, 11, 12) 
(3, 4, 5, 6), (3, 4, 7, 8), (3, 4, 9, 10), (3, 4, 11, 12), (5, 6, 7, 8) 
(5, 6, 9, 10), (5, 6, 11, 12), (7, 8, 9, 10), (7, 8, 11, 12), (9, 10, 11, 12). 


Under the hypothesis each arrangement had equal probability of being the 
observed arrangement. The corresponding values of the statistics are 


92.1 64.7 79.5 67.5 88.0 
73.9 50.0 73.9 77.9 72.5 
89.4 70.7 84.9 59.7 80.5 


The observed value 92.1 is the largest and the hypothesis would be rejected 
at the 6 2/3% level. 
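
The fifteen first samples and the level 1/15 can be reproduced with a few lines. This sketch assumes that the fifteen values quoted above are listed in the same order as the fifteen first samples, which agrees with the three values given earlier for (1, 2, 3, 4), (5, 6, 7, 8) and (9, 10, 11, 12).

```python
from fractions import Fraction
from itertools import combinations

pairs = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10), (11, 12)]
# Each first sample is the union of two of the six pairs.
first_samples = [a + b for a, b in combinations(pairs, 2)]

# Values of the statistic quoted in the text, assumed to be in the same order.
values = [92.1, 64.7, 79.5, 67.5, 88.0,
          73.9, 50.0, 73.9, 77.9, 72.5,
          89.4, 70.7, 84.9, 59.7, 80.5]

observed = values[0]   # the observed division (1, 2, 3, 4)
p_value = Fraction(sum(v >= observed for v in values), len(values))
print(len(first_samples), p_value, float(p_value))
```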


† Given the set of 12 observations, there are 12! possible outcomes for performing a full randomization test. 
This collection of possible outcomes can be partitioned in any way, and a conditional test performed, given the set 
in which the actual outcome occurs. If the numerical values of coordinates of observations are not used in forming 
the sets (and it seems reasonable not to use them), then the “possible outcomes” in a set must result from applying 
a group of permutations to any one “possible outcome.” The reason is that after any permutation has been applied 
we obtain a “possible outcome” which is as entitled to receive a further permutation as was the original outcome. 
The collection of permutations so generated will necessarily form a group. 

Note that if the tests are to be sensitive we must use permutations that mix 
up the original first and second sample observations. 

For the two-sample problem with larger samples, a group containing 100 to 
500 permutations would be reasonable. The labour of computation would 
then be roughly proportional to the total sample size. 


The methods in this paper extend simply to the r-sample problem, and also 
to any problem where the distribution has a symmetry under the hypothesis 
that is not found under the alternative. 

A METHOD OF ADJUSTMENT FOR DEFECTIVE DATA 


MORRIS JAMES SLONIM 
Headquarters, United States Air Force 
AND 
CHESTER H. McCALL, JR. 
George Washington University 

This article illustrates an original method of adjusting defective data 
when gross errors can be detected. The method entails an assumption 
concerning the distribution of total errors in order to estimate the effect 
of undetectable errors. While this type of situation may not be com- 
mon, it could occur in almost any data-gathering activity and could 
therefore have application in a wide variety of organizations. 


A PROJECT in the Directorate of Statistical Services, Headquarters, U. S. Air 
Force, concerned determination of fuel consumption rates for selected 
types and models of aircraft. Data obtained from official records revealed that 
fuel consumption rates, as a whole, had been understated; however, the degree 
of error for each type and model was not known. The source of error was not 
known definitely, either, although it was suspected that data relating to gallons 
of fuel issued were understated. 

These data were at that time being entered in a special reporting form but 
the personnel assigned the responsibility of recording the data were not al- 
ways complying with the reporting requirement. At the same time, flying 
hours were apparently being reported correctly by the pilots. Such a situation 
would account for a general understatement of the fuel consumption rates, 
since these rates are expressed as gallons of fuel per hour flown. 

Essentially, each Air Force wing reported gallons of fuel issued and the num- 
ber of hours flown for all aircraft of a given type and model as a single entry. 
This consolidation had the effect of combining correct and incorrect rates and 
concealing gross errors in the fuel consumption rates of individual aircraft. 
Thus, it was not possible to identify understated fuel consumption rates by a 
study of existing reports. Examination of records of individual aircraft verified 
the assumption that some fuel issues had not been reported. Steps were initiated 
to institute a reporting system that would correct this situation. However, in 
order to obtain acceptable fuel consumption rates during the interim period, 
it was necessary to develop a method that would adjust the incorrect fuel 
consumption data effectively. Three adjustment methods were developed, one 
of which is described here. 

A world-wide sample of USAF units was inaugurated to obtain the number 
of gallons of fuel issued and the number of hours flown for each of the most re- 
cent six months for individual aircraft, by selected types and models. The data 
for each plane for each month were entered in a separate Hollerith punch card, 
and the fuel consumption rate for each was calculated and entered in the same 
card. A frequency distribution of fuel consumption rates was obtained for every 
type and model of aircraft. A mythical aircraft, the Z-99, together with ficti- 
tious data will be used to illustrate the method of adjustment of fuel consump- 
tion rates. 


TABLE 737 


Z-99 AIRCRAFT 
(FICTITIOUS DATA) 


Consumption Rate Number of Gals. of Fuel 
(Gals. Per Hour) 


0 
1-10 
11-20 
21-30 
31-40 
41-50 
51-60 
61-70 
71-80 
81-90 
91-100 
101-110 2,175,382 
111-120 764,088 
121-130 363,894 
131-140 69,402 
141-150 26,381 
151-160 14,727 
161-170 18,496 
171-180 10,880 
181-190 9,985 
191-200 6,242 


TOTALS 105,324 9,740,001 


It appears feasible to select a cut-off point on the fuel consumption rate scale 
that will tend to isolate gross errors. For an actual aircraft type and model with 
a distribution generally similar to that of the mythical Z-99 it is known that the 
aircraft could not achieve sustained flight with a rate of 50 gallons per hour or 
less. It should be noted that while fuel consumption rates below the cut-off 
point represent gross errors, it is possible to have instances of very high con- 
sumption rates (for such reasons as excessive run-up time, delayed take-offs, 
electronic fuel control malfunctions, etc.). 

In our illustration, the 20 readings between 0 and 50 gallons/hr. are therefore 
gross errors.¹ These 20 gross errors account for 888 flying hours and 27,791 
gallons of fuel. When these amounts are deducted from the totals for all 2,140 in- 
dividual aircraft readings in the sample, the first approximation of an adjusted 
consumption rate (X’) becomes 93.00 gallons per hour compared with the un- 
adjusted rate of 92.48. 
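
The arithmetic of this first adjustment can be checked directly. The sketch below assumes that the two figures in the totals row of the table are total hours flown (105,324) and total gallons of fuel (9,740,001); the variable names are illustrative.

```python
total_hours, total_gallons = 105_324, 9_740_001   # totals row (assumed: hours, gallons)
gross_hours, gross_gallons = 888, 27_791          # the 20 readings treated as gross errors

unadjusted = total_gallons / total_hours
first_adjusted = (total_gallons - gross_gallons) / (total_hours - gross_hours)

print(round(unadjusted, 2), round(first_adjusted, 2))
```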

It appeared reasonable next to assume that if the distribution contains a small 
number of obvious gross errors it also contains a larger number of smaller 
errors. Inspection of basic records on hours flown and fuel issues indicated that, 


1 Obviously, an element of judgment enters into the selection of the exact cut-off point for gross errors. In the 
actual study, the present method was applied using three separate cut-offs for each type and model. The method 
yielded approximately the same results for each cut-off value, indicating that it is, in a fashion, self-adjusting. 

[Figure: left half of the error distribution curve, with two horizontal scales: 
* scale of average errors, first adjustment (X’); ** scale of average errors, 
second adjustment (X’’).] 

Fig. 738. Distribution of Errors in Fuel Consumption Rates 


in general, gross errors in the fuel consumption rates of individual aircraft in a 
given month were due to omission of two or more fuel issue entries in that 
month. The frequency of occurrence of single omissions in the same period can 
be expected to be greater, and these would result, for the most part, in relatively 
small errors. Such errors are not detectable because they have the effect of 
reducing reasonable consumption rates to rates which, though low, are still 
within the performance range of the specified type and model of aircraft. In- 
asmuch as the errors could be expected to occur at random with small errors 
occurring more frequently than large, the error distribution was assumed to 
follow the normal curve.² It should be noted that we are dealing with errors of 
omission, that is to say, errors due to failure to enter fuel issue data in the 
reporting form. Thus we are concerned only with negative errors and are there- 
fore restricted to the left half of the error distribution curve (see Fig. 738). 

The number of gross average errors between C (the cut-off for gross errors) 
and X’ was earlier seen to be 20. It remains to estimate the number of “small” 
average errors between O and C (Fig. 738). To do this it is necessary first to 
obtain σ of the average error distribution and, from this, the value of the normal 
deviate at C. 

Regardless of the number of errors, it is known that the maximum average 
error occurs when the reported fuel consumption rate is 0. This maximum aver- 
age error is equal to the true average fuel consumption rate X. The true average 


2 It is clear that the distribution of the errors in this type of situation is a matter of conjecture, hence the results 
yielded by such an assumption provide about the only measure of its validity. In the project at hand, end-of-the-year 
accounting data indicated that the fuel consumption rate revisions based on the method described here were highly 
accurate. 

rate is unknown; however, the best estimate at this point is X’. Thus, X’ (in 
the present instance, 93.00) is our best estimate of the error range of the “minus” 
errors. The total range of the error distribution (if both halves were considered) 
would be 2X’. L. H. C. Tippett’s table [3] of mean range of samples, given in 
terms of the normal deviate, was used to estimate σ of the average error distri- 
bution. Although the value of the normal deviate in this table varies with the 
sample size, we have in the present problem a situation where the range is 
already at a maximum for the specified estimate of the average consumption 
rate. Therefore, the maximum normal deviate (approximately 6.5) was used 
to estimate σ. In the present instance, the range was 2 × 93.00, or 186.00, hence 
the first approximation of σ was 

$$\sigma = \frac{186}{6.5} = 28.62$$


The number of small errors between zero and the cut-off point on the error 
distribution curve can be estimated as illustrated below. 

$$\text{Normal deviate at cut-off point} = \frac{93.00 - 50.00}{28.62} = 1.50$$

Area from 0 to cut-off point 
= .8664 of total area under the left half of the normal curve. 

Hence the 20 gross errors between C and X’ comprise .1336 of the errors, which 
total 149.7. Thus, the estimated number of small errors is 129.7. 

To measure the effect of these errors on the consumption rate it is necessary 
to estimate the average error value. This figure was obtained from the table of 
areas under the normal curve by finding the expected value of the small errors 
between O and C. The derivation of this expected value follows quite directly by 
applying laws of conditional probability and the calculus. It can further be 
shown that the average error approaches a maximum normal deviate of .7978 
as z approaches infinity. In the present case, the expected value of the normal 
deviate for the errors between O and C was .6219. Since we noted earlier that 
σ was 28.62, the average amount of error is 17.80 gallons per hour for the unde- 
tected small errors. 
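
The expected value referred to here is the mean of a standard normal deviate restricted to the interval from 0 to z, which works out to (φ(0) − φ(z)) / (Φ(z) − 1/2), where φ and Φ are the standard normal density and distribution function. A brief sketch (the function name is hypothetical) evaluates it at z = 1.50 and for large z:

```python
from statistics import NormalDist


def mean_truncated_deviate(z):
    """E[Z | 0 < Z < z] for a standard normal Z."""
    nd = NormalDist()
    return (nd.pdf(0) - nd.pdf(z)) / (nd.cdf(z) - 0.5)


expected = mean_truncated_deviate(1.50)
average_error = expected * 28.62            # in gallons per hour
limit = mean_truncated_deviate(10.0)        # effectively the limit as z grows

print(expected, average_error, limit)
```

The computed values agree with the .6219 and .7978 quoted above to within about one unit in the fourth decimal place.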

An estimate of the average hours flown by each of the aircraft represented by 
these 129.7 errors is the average hours flown by all aircraft in the residual dis- 
tribution (original distribution less gross errors). This is simply the quotient of 
hours flown by the number of aircraft. 

In the present instance, the figure is 49.26 hours per aircraft. 

The total number of hours flown by the aircraft representing the 129.7 errors 
is 6,389. Moreover, the estimate of average error for these errors was 17.80 
gallons per hour. Accordingly, the total understatement, in gallons, for these 
errors would be 112,724. 

Adding this to the total consumption figure (exclusive of gross errors) yields 
the second approximation for the fuel consumption rate (X’’), or 94.08 gallons 
per hour. 

It is apparent that this second approximation widens the range of the aver- 
age error distribution (Fig. 738). Consequently, the procedures are repeated 
to get a third approximation, X’’’. In the present instance this was 94.16 gallons 
per hour, which is not very different from X’’. Accordingly, the iterative process 
was halted with X’’’. 
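
The whole iteration can be sketched in a few lines. This is a reading of the procedure under stated assumptions, not the authors' symbolic representation: the totals are taken to be 105,324 hours and 9,740,001 gallons over 2,140 readings, the function name and arguments are hypothetical, and no intermediate rounding is applied, so the results differ slightly in the second decimal from the 94.08 and 94.16 quoted in the text.

```python
from statistics import NormalDist


def next_approximation(x_prev, base_gallons, base_hours, n_readings,
                       n_gross, cutoff, max_deviate=6.5):
    """One step of the adjustment: returns the next estimate of the rate."""
    nd = NormalDist()
    sigma = 2 * x_prev / max_deviate                # range 2X over the maximum deviate
    z_c = (x_prev - cutoff) / sigma                 # normal deviate at the cut-off
    half_area = nd.cdf(z_c) - 0.5
    small = n_gross / (1 - 2 * half_area) - n_gross  # estimated number of small errors
    avg_error = sigma * (nd.pdf(0) - nd.pdf(z_c)) / half_area
    hours = small * base_hours / n_readings         # hours flown by those readings
    return (base_gallons + hours * avg_error) / base_hours


base_gallons = 9_740_001 - 27_791    # totals less the 20 gross errors
base_hours = 105_324 - 888
n_readings = 2_140 - 20

x = base_gallons / base_hours        # first approximation, about 93.00
history = [x]
for _ in range(10):
    x = next_approximation(x, base_gallons, base_hours, n_readings, 20, 50.0)
    history.append(x)

print([round(h, 2) for h in history[:4]])
```

In this sketch successive estimates settle quickly, which is consistent with the apparent convergence reported for the hand calculation.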

The authors have developed a symbolic representation of this iterative proc- 
ess. It was further apparent to the authors that the question of convergence 
of the normal deviate for the average error should be investigated. Due to 
the complex form of the iterative process, the usual tests yielded indetermi- 
nate solutions. To indicate the existence of convergence, the authors have shown 
that, for two hypothetical cases, an upper bound does exist in the approxima- 
tion process. This was accomplished by selecting as a first estimate a value 
larger than that arrived at by the iteration process. In both cases, the second 
approximation was smaller than the first estimate. Eventually, the same stable 
quantity was obtained, indicating an apparent convergence in the iteration 
process. 

In the project at hand, the number of separate fuel consumption rates under 
study was relatively small, and the calculations were easily performed by one 
clerk with a desk calculator. In all cases, differences between X’’ and X’’’ were 
small and calculations were not continued beyond the third adjusted estimate. 

Where the number of separate estimates is large, and in particular where the 
degree of adjustment desired might require several iterations, this procedure 
would lend itself nicely to electronic data processing. 

A TEST OF VARIANCES 


K. V. RAMACHANDRAN 
Demographic Training & Research Centre, Bombay 38 


A two sided test for the equality of variances from two normal 
populations is given which is not only completely unbiased but has 
also the stronger property of monotonicity. An example is given to 
illustrate the use of this test and the performance of this test is com- 
pared with that of the current equal tail area test. The case of one 
variance from a normal population is also considered. Tables are pro- 
vided for carrying out these tests. 


INTRODUCTION AND SUMMARY 


IN THIS paper we prove that under a certain partition of the tail areas, the two
sided F test for the equality of variances from two univariate normal populations
and the two sided $\chi^2$ test for testing a variance from a univariate normal
population have the monotonicity property. This property is that the power 
function of the test monotonically increases, as each of the parameters involved 
in the power function (to be called the deviation parameters) tends away from 
its value on the null hypothesis. This property has a considerable significance in 
terms of the loss functions of the general Wald theory. 

The weaker property of complete unbiasedness is automatically satisfied by a 
test possessing the monotonicity property. 

An example is given illustrating, in the case of two populations, the use of the
equal tail area test and the new test. The powers of the two tests are given in 
Table 300. 

The case of testing a variance from a normal population, which is based on
the $\chi^2$ statistic, is also considered.


THE PROBLEM 
Two populations 


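Let $x_{1i}$ ($i=1, 2, \ldots, n_1$) and $x_{2i}$ ($i=1, 2, \ldots, n_2$) be independent samples
of sizes $n_1$ and $n_2$ from normal populations with means $\mu_1$ and $\mu_2$ and variances
$\sigma_1^2$ and $\sigma_2^2$, respectively. To test the hypothesis $H_0: \sigma_1^2=\sigma_2^2$ against $H: \sigma_1^2 \neq \sigma_2^2$
we have the test procedure:

Accept $H_0: \sigma_1^2=\sigma_2^2$ if

$$F_1 \le F \le F_2, \qquad (1)$$

where

$$F = \frac{\sum_{i=1}^{n_1} (x_{1i}-\bar{x}_1)^2/(n_1-1)}{\sum_{i=1}^{n_2} (x_{2i}-\bar{x}_2)^2/(n_2-1)}, \qquad (2)$$

$$n_1\bar{x}_1 = \sum_{i=1}^{n_1} x_{1i}, \qquad n_2\bar{x}_2 = \sum_{i=1}^{n_2} x_{2i}, \qquad (3)$$

$$\alpha = \int_0^{F_1} p(F \mid 1; n_1-1, n_2-1)\,dF + \int_{F_2}^{\infty} p(F \mid 1; n_1-1, n_2-1)\,dF, \qquad (4)$$

$$p(F \mid \delta^2; n_1-1, n_2-1) = \text{const.}\ \frac{1}{\delta^2}\left(\frac{F}{\delta^2}\right)^{(n_1-3)/2}\left[1+\frac{n_1-1}{n_2-1}\,\frac{F}{\delta^2}\right]^{-(n_1+n_2-2)/2}, \qquad (5)$$

$\delta^2 = (\sigma_1^2/\sigma_2^2)$ and $\alpha$ is the size of the test. The hypothesis is rejected if (1) is not
satisfied.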
One of the current procedures is to choose $F_1$ and $F_2$ such that

$$\int_0^{F_1} p(F \mid 1; n_1-1, n_2-1)\,dF = \int_{F_2}^{\infty} p(F \mid 1; n_1-1, n_2-1)\,dF = \frac{\alpha}{2}. \qquad (6)$$

If we do not impose the restriction (6), then the test procedure given in (1)
depends on two quantities $F_1$ and $F_2$ which we can choose in an infinite number
of ways such that the level of significance of the test is $\alpha$. Equation (6) gives
one partition of the tail areas. This is the equal tail area F test.

Another procedure based on the Neyman-Pearson likelihood ratio method
gives another partition of the tail areas which is as follows:

Choose $F_1$ and $F_2$ such that

$$\int_0^{F_1} p(F \mid 1; n_1-1, n_2-1)\,dF + \int_{F_2}^{\infty} p(F \mid 1; n_1-1, n_2-1)\,dF = \alpha$$

and

$$\frac{F_1^{\,n_1/2}}{\left[1+\frac{n_1-1}{n_2-1}F_1\right]^{(n_1+n_2)/2}} = \frac{F_2^{\,n_1/2}}{\left[1+\frac{n_1-1}{n_2-1}F_2\right]^{(n_1+n_2)/2}}. \qquad (7)$$

It can be seen that unless $n_1=n_2$, the test procedures given by (6) and (7) have
power functions which do not have the monotonicity property and are actually
biased.

In this paper we shall choose the tail areas in such a way that the resulting
test is locally unbiased and then show that in this case it also happens to
possess the monotonicity property (and hence complete unbiasedness). The
values of $F_1'$ and $F_2'$ based on (18) for $\alpha=.05$ are given in Table 743 for different
values of $n_1-1$ and $n_2-1$.


One population

Let $x_i$ ($i=1, 2, \ldots, n_1$) be a random sample of size $n_1$ from a normal population
with mean $\mu_1$ and variance $\sigma_1^2$. To test the hypothesis $H_0: \sigma_1^2=\sigma^2$ against
$H: \sigma_1^2 \neq \sigma^2$ we have the test procedure:
accept $H_0$ if

$$\chi_1^2 \le \chi^2 \le \chi_2^2, \qquad (8)$$
TABLE 743*

VALUES OF $F_1'$ AND $F_2'$ FOR $\alpha=.05$ AND FOR
DIFFERENT VALUES OF $n_2-1$, $n_1-1$

[Table entries are illegible in the source.]

* The values given in the table are $F_2'$. To obtain the value of $F_1'$ for $n_1-1$, $n_2-1$, take the reciprocal of $F_2'$
with $n_2-1$, $n_1-1$.

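where

$$\chi^2 = \sum_{i=1}^{n_1} (x_i-\bar{x})^2/\sigma^2, \qquad (9)$$

$$n_1\bar{x} = \sum_{i=1}^{n_1} x_i, \qquad (10)$$

$$\alpha = \int_0^{\chi_1^2} p(\chi^2 \mid 1; n_1-1)\,d\chi^2 + \int_{\chi_2^2}^{\infty} p(\chi^2 \mid 1; n_1-1)\,d\chi^2, \qquad (11)$$

$$p(\chi^2 \mid \delta^2; n_1-1) = \text{const.}\ \frac{1}{\delta^2}\left(\frac{\chi^2}{\delta^2}\right)^{(n_1-3)/2} e^{-\chi^2/(2\delta^2)}, \qquad (12)$$

$\delta^2 = (\sigma_1^2/\sigma^2)$ and $\alpha$ is the size of the test. The hypothesis is rejected if (8) is not
satisfied.
The equal tail area test here is based on (8) subject to

$$\int_0^{\chi_1^2} p(\chi^2 \mid 1; n_1-1)\,d\chi^2 = \int_{\chi_2^2}^{\infty} p(\chi^2 \mid 1; n_1-1)\,d\chi^2 = \frac{\alpha}{2}. \qquad (13)$$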

The 5% points of the new procedure based on (23) are given in Table 744. 


TABLE 744*

VALUES OF $\chi_1'^2$ AND $\chi_2'^2$ FOR $\alpha=.05$ AND FOR
DIFFERENT VALUES OF $n_1-1$

$n_1-1$   $\chi_1'^2$   $\chi_2'^2$        $n_1-1$   $\chi_1'^2$   $\chi_2'^2$
   2         —           9.53            14        5.95       27.26
   3         —          11.19            16        7.24       29.95
   4         —          12.80            18        8.58       32.61
   5         —          14.37            20        9.96       35.23
   6         —          15.90            22       11.36       37.82
   7         —          17.39            24       12.79       40.39
   8         —          18.86            30       17.21       47.96
  10         —          21.73            40       24.86       60.32
  12         —          24.52            60       40.93       84.23

* For values of $n_1-1$ greater than 60, $\chi_1'^2$ and $\chi_2'^2$ can be calculated approximately from the equations:
$\sqrt{2\chi_1'^2} = \sqrt{2n_1-3} - 1.96$ and $\sqrt{2\chi_2'^2} = \sqrt{2n_1-3} + 1.96$.


EXAMPLE


Let $s_1^2=5.8$ and $s_2^2=3.6$ be sample variances based on $n_1-1=4$ and
$n_2-1=20$ d.f. from two independent normal populations. We wish to
test $H_0$: that the variances of the two populations are the same, against the
alternative $H$: that they are not the same.

Now the equal tail area test is:
accept $H_0$ if

$$F_1 \le \frac{s_1^2}{s_2^2} \le F_2, \quad \text{i.e., if} \quad \frac{1}{8.56} < \frac{5.8}{3.6} < 3.51,$$

where $F_1 = 1/8.56$ and $F_2=3.51$ are the lower and upper points of the equal tail
area F test of size .05 and with 4 and 20 d.f. [1, pp. 426-7, Table V].

The new procedure is:
accept $H_0$ if

$$F_1' \le \frac{s_1^2}{s_2^2} \le F_2', \quad \text{i.e., if} \quad \frac{1}{7.16} < \frac{5.8}{3.6} < 3.98,$$

where $F_1'=1/7.16$ and $F_2'=3.98$ are the lower and upper points of the new
procedure of size .05 and with 4 and 20 d.f. (Table 743 of this paper).

Since $5.8/3.6 \approx 1.61$ lies inside both intervals, both tests accept $H_0$ here.
It can be seen that the numerical difficulty in carrying out the two tests is
the same but whereas the equal tail area test is biased (power of the equal tail
area test $<\alpha$ for some values of the deviation parameter) the one based on the
new procedure is completely unbiased (see Table 744a).


TABLE 744a

POWER OF NEW TEST AND EQUAL TAIL AREA TEST FOR $\delta^2=.1$ (.1) 1.6
WHEN $n_1-1=4$ AND $n_2-1=20$

$\delta^2$    Power of new test    Power of equal tail test
  .1          .729                 .645
  .2          .397                 .322
  .3          .230                 .189
  .4          .158                 .121
  .5          .113                 .085
  .6          .085                 .064
  .7          .068                 .053
  .8          .057                 .048
  .9          .052                 .047
 1.0          .050                 .050
 1.1          .052                 .056
 1.2          .056                 .065
 1.3          .062                 .075
 1.4          .070                 .087
 1.5          .080                 .102
 1.6          .091                 .117
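
The power entries for the two tests can be checked numerically. The following is a minimal sketch, not the author's computation: it assumes the power at $\delta^2$ is one minus the probability that a central F variate with 4 and 20 d.f. falls between the two cut-off points divided by $\delta^2$, and it integrates the F density by Simpson's rule instead of reading tables. The function names and the step count are illustrative choices.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the central F distribution with (d1, d2) d.f.; assumes d1 > 2."""
    if x <= 0.0:
        return 0.0
    log_const = (math.lgamma((d1 + d2) / 2) - math.lgamma(d1 / 2)
                 - math.lgamma(d2 / 2) + (d1 / 2) * math.log(d1 / d2))
    log_pdf = (log_const + (d1 / 2 - 1) * math.log(x)
               - ((d1 + d2) / 2) * math.log1p(d1 * x / d2))
    return math.exp(log_pdf)

def f_cdf(x, d1, d2, steps=4000):
    """P(F <= x) by composite Simpson integration (steps must be even)."""
    if x <= 0.0:
        return 0.0
    h = x / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f_pdf(k * h, d1, d2)
    return total * h / 3

def power(delta2, lower, upper, d1, d2):
    """Power of the rule 'accept H0 if lower <= F <= upper' when sigma1^2/sigma2^2 = delta2."""
    return 1.0 - (f_cdf(upper / delta2, d1, d2) - f_cdf(lower / delta2, d1, d2))

if __name__ == "__main__":
    # Cut-off points quoted in the example for 4 and 20 d.f.
    new_test = (1 / 7.16, 3.98)
    equal_tail = (1 / 8.56, 3.51)
    for tenth in range(1, 17):
        d2ratio = tenth / 10
        print(f"{d2ratio:4.1f}  {power(d2ratio, *new_test, 4, 20):.3f}"
              f"  {power(d2ratio, *equal_tail, 4, 20):.3f}")
```

Run as a script, this prints one row per value of $\delta^2$ from .1 to 1.6, which can be compared with the tabulated powers; small differences in the third decimal are to be expected because the cut-off points are themselves rounded.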
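PROOF OF MONOTONICITY OF TESTS
Test on variances based on two normal populations
Let P denote the power of the test procedure, i.e., let

$$P = 1 - \int_{F_1'}^{F_2'} p(F \mid \delta^2; n_1-1, n_2-1)\,dF, \qquad (14)$$

denote the power of the test procedure based on $F_1'$ and $F_2'$ where

$$1-\alpha = \int_{F_1'}^{F_2'} p(F \mid 1; n_1-1, n_2-1)\,dF, \qquad (15)$$

and $\delta^2 = (\sigma_1^2/\sigma_2^2) \neq 1$ is the deviation parameter. We shall choose $F_1'$ and $F_2'$
such that in addition to (15) we also have $\partial P/\partial\delta^2=0$ when $\delta^2=1$, i.e., when
$\sigma_1^2=\sigma_2^2$.

Now

$$P = 1 - \text{const.} \int_{F_1'}^{F_2'} \frac{1}{\delta^2}\left(\frac{F}{\delta^2}\right)^{(n_1-3)/2}\left[1+\frac{n_1-1}{n_2-1}\,\frac{F}{\delta^2}\right]^{-(n_1+n_2-2)/2} dF \qquad (16)$$

is the power function of the test procedure. Notice that the constant is positive
and is independent of $\delta^2$.

Hence

$$\frac{\partial P}{\partial\delta^2} = \frac{\text{const.}}{\delta^2}\left\{\frac{(F_2'/\delta^2)^{(n_1-1)/2}}{\left[1+\frac{n_1-1}{n_2-1}\,\frac{F_2'}{\delta^2}\right]^{(n_1+n_2-2)/2}} - \frac{(F_1'/\delta^2)^{(n_1-1)/2}}{\left[1+\frac{n_1-1}{n_2-1}\,\frac{F_1'}{\delta^2}\right]^{(n_1+n_2-2)/2}}\right\}. \qquad (17)$$

The condition $\partial P/\partial\delta^2=0$ when $\delta^2=1$ is equivalent to

$$\frac{F_1'^{\,(n_1-1)/2}}{\left[1+\frac{n_1-1}{n_2-1}F_1'\right]^{(n_1+n_2-2)/2}} = \frac{F_2'^{\,(n_1-1)/2}}{\left[1+\frac{n_1-1}{n_2-1}F_2'\right]^{(n_1+n_2-2)/2}}. \qquad (18)$$

Now (17) reduces to

$$\frac{\partial P}{\partial\delta^2} = \frac{\text{const.}}{\delta^2}\,\frac{(F_1'/\delta^2)^{(n_1-1)/2}}{\left[1+\frac{n_1-1}{n_2-1}\,\frac{F_1'}{\delta^2}\right]^{(n_1+n_2-2)/2}}\left\{\left(\frac{F_2'}{F_1'}\right)^{(n_1-1)/2}\left[\frac{1+\frac{n_1-1}{n_2-1}\,\frac{F_1'}{\delta^2}}{1+\frac{n_1-1}{n_2-1}\,\frac{F_2'}{\delta^2}}\right]^{(n_1+n_2-2)/2} - 1\right\}. \qquad (19)$$

Hence using (18) we get (19) equal to

$$\frac{\text{const.}}{\delta^2}\,\frac{(F_1'/\delta^2)^{(n_1-1)/2}}{\left[1+\frac{n_1-1}{n_2-1}\,\frac{F_1'}{\delta^2}\right]^{(n_1+n_2-2)/2}}\left\{\left[\frac{\left(1+\frac{n_1-1}{n_2-1}F_2'\right)\left(1+\frac{n_1-1}{n_2-1}\,\frac{F_1'}{\delta^2}\right)}{\left(1+\frac{n_1-1}{n_2-1}F_1'\right)\left(1+\frac{n_1-1}{n_2-1}\,\frac{F_2'}{\delta^2}\right)}\right]^{(n_1+n_2-2)/2} - 1\right\}. \qquad (20)$$

Now it is easy to check that the ratio in square brackets in (20) exceeds 1 exactly when
$(F_2'-F_1')(1-1/\delta^2) > 0$, so that

$$\frac{\partial P}{\partial\delta^2} > 0 \ \text{ if } \ \delta^2 > 1, \ \text{that is, if } \ \sigma_1^2 > \sigma_2^2,$$

$$\text{and} \quad \frac{\partial P}{\partial\delta^2} < 0 \ \text{ if } \ \delta^2 < 1, \ \text{that is, if } \ \sigma_1^2 < \sigma_2^2.$$

Thus P, the power function of the test subject to (18), is a monotonic increasing
function of $\delta^2$ if $\delta^2>1$ and a monotonic decreasing function of $\delta^2$ if $\delta^2<1$. Hence
the new test based on $F_1'$ and $F_2'$ has the monotonicity property.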
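
The two conditions that define the new test, size $\alpha$ as in (15) and equal values of $F^{(n_1-1)/2}/[1+\frac{n_1-1}{n_2-1}F]^{(n_1+n_2-2)/2}$ at the two cut-off points as in (18), can be solved numerically. The sketch below is one possible approach and is not taken from the paper: it assumes nested bisection is adequate, uses Simpson integration for the F distribution function, and all function names are hypothetical. Here `d1` and `d2` stand for $n_1-1$ and $n_2-1$.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the central F distribution with (d1, d2) d.f.; assumes d1 > 2."""
    if x <= 0.0:
        return 0.0
    log_const = (math.lgamma((d1 + d2) / 2) - math.lgamma(d1 / 2)
                 - math.lgamma(d2 / 2) + (d1 / 2) * math.log(d1 / d2))
    return math.exp(log_const + (d1 / 2 - 1) * math.log(x)
                    - ((d1 + d2) / 2) * math.log1p(d1 * x / d2))

def f_cdf(x, d1, d2, steps=4000):
    """P(F <= x) by composite Simpson integration (steps must be even)."""
    h = x / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f_pdf(k * h, d1, d2)
    return total * h / 3

def log_kernel(f, d1, d2):
    """Log of F^(d1/2) / [1 + d1*F/d2]^((d1+d2)/2); it peaks at F = 1."""
    return (d1 / 2) * math.log(f) - ((d1 + d2) / 2) * math.log1p(d1 * f / d2)

def matching_upper(lower, d1, d2):
    """For 0 < lower < 1, find upper > 1 with the same kernel value."""
    target = log_kernel(lower, d1, d2)
    lo, hi = 1.0, 2.0
    while log_kernel(hi, d1, d2) > target:   # expand until the root is bracketed
        hi *= 2.0
    for _ in range(80):
        mid = (lo + hi) / 2
        if log_kernel(mid, d1, d2) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def unbiased_cutoffs(d1, d2, alpha=0.05):
    """Lower and upper cut-off points satisfying both conditions."""
    lo, hi = 0.0, 1.0   # the size rises from 0 to 1 as the lower point moves from 0 to 1
    for _ in range(40):
        lower = (lo + hi) / 2
        upper = matching_upper(lower, d1, d2)
        size = f_cdf(lower, d1, d2) + 1.0 - f_cdf(upper, d1, d2)
        if size < alpha:
            lo = lower
        else:
            hi = lower
    lower = (lo + hi) / 2
    return lower, matching_upper(lower, d1, d2)

lower, upper = unbiased_cutoffs(4, 20)
print(1 / lower, upper)   # compare with the example's 7.16 and 3.98
```

For 4 and 20 d.f. the result can be set against the cut-off points used in the example; agreement is only expected up to the rounding of the published values.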


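Test on a variance based on a normal population
In this case

$$P = 1 - \int_{\chi_1'^2}^{\chi_2'^2} p(\chi^2 \mid \delta^2; n_1-1)\,d\chi^2, \qquad (21)$$

where

$$1-\alpha = \int_{\chi_1'^2}^{\chi_2'^2} p(\chi^2 \mid 1; n_1-1)\,d\chi^2, \qquad (22)$$

and $\delta^2 \neq 1$ is the deviation parameter.

Here also if we choose $\chi_1'^2$ and $\chi_2'^2$ subject to (22) and $\partial P/\partial\delta^2=0$ when $\delta^2=1$, i.e.,
when $\sigma_1^2=\sigma^2$, then proceeding as above the monotonicity of the new test here
also could be proved. The condition $\partial P/\partial\delta^2=0$ when $\sigma_1^2=\sigma^2$ reduces to

$$(\chi_1'^2)^{(n_1-1)/2}\, e^{-\chi_1'^2/2} = (\chi_2'^2)^{(n_1-1)/2}\, e^{-\chi_2'^2/2}. \qquad (23)$$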
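
A quick consistency check of the tabulated 5% points is possible without any distribution function. The sketch below assumes the local unbiasedness condition takes the form that $x^{(n_1-1)/2} e^{-x/2}$ is equal at the lower and upper points, which follows from differentiating the power under the $\chi^2$ density; the rows are the legible entries of Table 744, and the function names are illustrative.

```python
import math

# (n1 - 1, lower point, upper point) as read from the legible half of Table 744.
TABLE_744 = [
    (14, 5.95, 27.26), (16, 7.24, 29.95), (18, 8.58, 32.61),
    (20, 9.96, 35.23), (22, 11.36, 37.82), (24, 12.79, 40.39),
    (30, 17.21, 47.96), (40, 24.86, 60.32), (60, 40.93, 84.23),
]

def log_kernel(x, df):
    """Log of x^(df/2) * exp(-x/2), the quantity assumed equal at both points."""
    return (df / 2) * math.log(x) - x / 2

def kernel_gaps(rows):
    """Difference of the log kernel between upper and lower points, per row."""
    return [(df, log_kernel(hi, df) - log_kernel(lo, df)) for df, lo, hi in rows]

for df, gap in kernel_gaps(TABLE_744):
    print(df, round(gap, 4))
```

Gaps close to zero indicate that a pair of points satisfies the condition to within the two-decimal rounding of the table.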
CONCLUDING REMARKS


The monotonicity property of the corresponding tests based on multivariate
normal populations has been proved in [2] under a certain condition. The tests 
in these situations are based on the largest and smallest characteristic roots of 
certain matrices [3, 4]. The percentage points for carrying out these procedures 
are being computed and will be offered for publication soon. 
first article in JASA. His interests are experimental design, analysis of variance, quality
control, sampling, regression and correlation, and application and development of sta- 
tistical methods. 
sidered sufficiently important to warrant investigation. When the fuel consumption rates
were adjusted by a method similar to that described in the present article, the discrepancy 
(for two successive years) was reduced to such an extent that the overall estimate from 
the adjusted factors attained an accuracy of around 99.5%.” 
Introduction to Modern Statistics. Werner Z. Hirsch. New York: The Macmillan Company,
1957. Pp. xiv, 429. $6.50.


Frep M. Northwestern University 


This textbook is written for the first statistics course in an undergraduate business
or economics program. It is for students who do not know any calculus. In terms 
of its total coverage the book is not much different from the many others commonly 
used in such courses. It is in its method of presenting the subject-matter and in its 
emphasis on aspects other than the clerical ones that the book makes its contribution 
to the teaching of statistics. 

Obviously Hirsch has devoted much effort to make this a book which the typical 
undergraduate will find easy and enjoyable reading. Although I have not conducted 
an experiment, the book surely would score high on “readability” in the formulas of 
those who try to quantify these things. Much of the material on descriptive statistics, 
for example, is enlivened by developing it in story form. C. C. Camel, the regional 
sales representative of “Lucie Tastes Better” Tobacco Company (LUTABE for 
short) has practical problems to solve and decisions to make. We observe him studying 
his text, Elementary Statistics, by Professor Sigma P. Ekks and mulling over what he 
is learning. There are others in the plot: Mr. Paul Mal, Jim Kent, Mr. Viceroy and 
the appropriately-named president of the company, C. E. Wilson. All this leads to 
lots of questions, answers, and dialogue. Suggestive of the comic strip, it may be 
excellent pedagogy. To top it off, well-selected cartoons are reproduced here and there 
to elicit a chortle while teaching a lesson. 

From the very beginning the book stresses inference and decision making. Already 
on p. 18 one is introduced to confidence intervals. And after another forty pages or so 
comes probability theory and the start of the standard material on estimation and 
hypothesis-testing of means, proportions, differences of means, etc. In the intervening 
pages the student is taught the $\Sigma$-notation, round-off errors, and the meaning of and
techniques for computing measures of central tendency and dispersion. An interesting 
innovation is to direct the reader to other elementary texts for a discussion of averages 
other than the mean, median and mode. Similarly the standard deviation and variance 
are the only measures of scatter presented. 

The treatment of estimation and testing is a model of clear textbook writing. 
Graphical techniques are effectively utilized to illuminate the ideas which beginners 
often find hard going. There are, however, a couple of matters to which the prospective
user of the book must be alerted. Most important is the confusion arising in
connection with the definition (p. 122) of the sample variance as $s^2=\sum(x-m)^2/(n-1)$,
m standing for the sample mean. This definition is inconsistent with those of others in
the book involving s. For example, right on the next page (as well as later in the book)
$s/\sqrt{n-1}$ is used as the estimator for the standard error of the mean “for small
samples.” To make matters worse, a problem two pages later gives the formula
$\sigma_{\bar{x}}=\sigma/\sqrt{n-1}$. It appears that the simplest way for the student to fix this up in his
copy of the book is to replace $n-1$ by $n$ in the definition of $s^2$ and to correct the slip
in the formula for $\sigma_{\bar{x}}$ by replacing $n-1$ by $n$. Another disturbing error appears in the
section on hypothesis-testing for a proportion. To compute the significance level and
power of the test (pp. 201-5), Hirsch incorrectly uses the observed sample proportions
instead of the hypothesized population proportions in the formula for the standard
error of the proportion. Erroneous also are the author’s attempts to define the
maximum-likelihood property in terms of unbiasedness and efficiency. And the summarizing
statement (p. 137), “A maximum-likelihood estimate is, under very general conditions,
both unbiased and efficient,” is incorrect. It is hoped that these matters as well
as other typographical errors can be corrected in the next printing.
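
The inconsistency over divisors can be made concrete with a small computation. The sketch below uses hypothetical data chosen only for illustration; it shows that $s/\sqrt{n-1}$ with $s$ defined on divisor $n$ coincides with $s/\sqrt{n}$ with $s$ defined on divisor $n-1$, while mixing the divisor $n-1$ in both places gives a larger figure.

```python
import math

def standard_errors(sample):
    """Three candidate standard errors of the mean for one sample."""
    n = len(sample)
    mean = sum(sample) / n
    ss = sum((x - mean) ** 2 for x in sample)   # sum of squared deviations
    s_n = math.sqrt(ss / n)                      # s defined with divisor n
    s_n1 = math.sqrt(ss / (n - 1))               # s defined with divisor n - 1
    return {
        "s_n / sqrt(n-1)": s_n / math.sqrt(n - 1),     # consistent pairing
        "s_n1 / sqrt(n)": s_n1 / math.sqrt(n),         # consistent pairing
        "s_n1 / sqrt(n-1)": s_n1 / math.sqrt(n - 1),   # the mixture criticised above
    }

# Hypothetical observations, not taken from the book under review.
result = standard_errors([4.0, 7.0, 5.0, 9.0, 10.0])
for label, value in result.items():
    print(label, round(value, 4))
```

The first two entries agree because both equal the square root of the sum of squares divided by $n(n-1)$.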

Following the block of chapters on testing and estimation comes a chapter on index 
numbers which, in addition to emphasizing the basic problems and explaining some 
of the formulas, gives a description of a number of the important price and quantity 
indexes published by various government agencies. This chapter is followed by a 
brief treatment of regression and correlation and a couple of chapters on time series 
analysis. Brief discussions of contingency tables and x? and of control charts complete 
the book except for a chapter that serves as postscript. This gives a glimpse of high-speed
electronic computation and of some recent quantitative economic techniques,
like inter-industry input-output analysis and linear programming.

This is not the sort of book which statistical practitioners will buy to keep on 
their reference shelf. But it is a book which instructors who believe that the elemen- 
tary statistics text need not be a dull book will wish to look at and try out. Interesting 
problems with a practical flavor stressing analysis rather than numerical work follow 
each of the chapters. 


Introduction to Statistical Inference. Jerome C. R. Li. Ann Arbor, Michigan: Edwards
Brothers, Inc., 1957. Pp. xiii, 553. $7.50. 


Dorothy C. Lowry, University of California, Berkeley


The appearance of Li’s book gave considerable pleasure to the reviewer, whose work
as a statistician is with scientists in the fields of population genetics (primarily), 
physiology, and nutrition. Its author does not claim for it what I regard as its prin- 
cipal virtue: that if it is read attentively and questioningly by an experimental 
scientist, he will gain an insight into the possibilities and limitations of the experi- 
mental material which will enrich his relation to his work. He may also find himself 
seeking a statistician’s help in planning his experiments or, if he is one of the people 
for whom such help has been sometimes available, he will find himself better able to 
communicate with and understand the statistician. The book is a carefully organized 
development of the principles of statistical inference written in admirable language, 
precise and plain, which carries the reader along smoothly. It is not easy for a biologist 
to take lecture notes involving unfamiliar symbols and subscripts without some loss 
in his capacity to follow and weigh the line of argument. He can take the book at his 
own pace without that encumbrance. 

After several chapters on descriptive statistics, the normal distribution and a 
sampling experiment, which is referred to throughout the book as an empirical verification
of the distributions of various statistics, the basic concept of statistical inference
is developed and the $t$, $\chi^2$, and $F$ distributions are introduced. The second part
of the book is devoted to the analysis of variance and covariance; the third part, to 
sampling from binomial and multinomial distributions, transformations, and an intro- 
duction to distribution-free methods of analysis. There is a short review chapter be- 
tween parts one and two and another between parts two and three. There are tables 
for the distributions mentioned above as well as tables for other tests described. An 
index for the theorems stated, to which the author carefully appeals whenever it is 
necessary, is included, as well as one for subject matter, one for figures, and a table 
for symbols and abbreviations. The examples, with a lone exception to be referred 
to later, are illuminating but the questions at the end of each chapter are not par- 
ticularly thought-provoking. 

The appearance of the text is good. I was aware of only two or three transpositions
of letters though there are two more serious errors: at the end of Chap. 11 in the
title of B. L. Welch’s paper “generation” has been printed for “generalization”; on
p. 290, lines 12 through 20 belong at the bottom of the page.

I think the following criticisms might be leveled: A little more discussion of models 
is needed, possibly including several not very obvious examples. Also the sentence 
“The test procedure for a mixed model is the same as that for the component of 
variance model” seems a rather serious oversimplification, since Li has emphasized 
that this model is commonly encountered by the scientist. The difficulties arising 
from unequal subclass numbers need more emphasis. The recommended procedure in 
the case of a test that two means are equal when the variances are unequal is not 
clear with respect to the degrees of freedom when $N_1 = N_2 = N$. Recommendations in
two examples as to the test to be performed when the hypothesis of linearity of regres- 
sion has been rejected seem mutually contradictory. The student may wonder why, 
at the bottom of p. 422, a test that $\beta=0$ is performed when he has been emphatically
told in a preceding chapter, 17.8 (2), that he is to use the treatment mean square 
against the error mean square when the regression is not linear. With respect to 
p. 423, surely line 17 should read “does not change,” since the F value is not sig- 
nificant. Li “accepts” an unrejected null hypothesis instead of merely concluding that 
the hypothesis is not inconsistent with the data. This usage is clearly a concession to 
the degree of simplicity he believes to be required, but its wisdom may be questioned. 
Lastly, there might well be a brief discussion of the source of the values of the
independent variable, x, and of several problems other than drawing inferences about
the regression of y on x which may be encountered.

None of these criticisms is of much weight. On the other hand it would be hard 
for me to overstate my opinion of the worth of the book for an experimental scientist
who knows he is not going to take a course in statistics. The empirical verification of 
theorems about the distributions of many statistics by sampling experiments from a 
population of 500 observations normally and independently distributed is very effec- 
tive, especially as the sampling experiments develop in complexity. The general 
orientation of statistical inference about the analysis of variance, even in the chapter 
on distribution-free tests, has resulted in sections on linear regression and on co- 
variance which seem more capable of being grasped by the beginner than is usually 
the case. Several things which I believe to be new and of great value in this book are, 
first, the emphasis on the individual degree of freedom test, the description of Duncan’s 
new multiple range test and the power to be gained by their use over the F test 
with $k-1$ degrees of freedom; and, second, the argument in the case of analysis of covariance
from the general test to the special test commonly used. The notation becomes,
as Li notes, somewhat difficult for the beginner, but even if he loses his way, he is
likely, because of his attempt to understand, to appreciate the need for caution
implied in Li’s assertion that “The variance of the difference between two adjusted
means has two different expressions depending on whether the $\beta$’s are equal.” Lastly,
Li was well-advised to restrict the number of tests presented in the chapter on distri- 
bution-free tests to a few for which correspondences could be drawn with the rest of 
the material. It is an example of the sort of judgment that has produced this well- 
integrated book. 


A Course in Multivariate Analysis. M. G. Kendall. Griffin’s Statistical Monographs and
Courses, Number Two. New York: Hafner Publishing Co., 1957. Pp. 185. $4.50.


Rolf E. Bargmann, Virginia Polytechnic Institute


This seems to be a time for multivariate analysis to come of age. T. W. Anderson’s
book on the subject appeared last fall; S. N. Roy’s book is now being distributed 
in the United States; and, finally, M. G. Kendall decided to present the notes of his 
1954/55 lectures on multivariate analysis in printed form. 
One remarkable feature of Kendall’s Course is its mode of presentation. The “classical”
system of teaching multivariate analysis—multivariate normal distribution,
Wishart’s distribution, Hotelling’s T?, distribution of characteristic roots and canon- 
ical correlation and, if time permitted, some mention of Wilks’ and Bartlett’s 
techniques—has been abandoned in favor of a functional approach. Kendall presents 
his material under the headings, “Component Analysis”, “Factor Analysis”, “Func- 
tional Relationship”, “Canonical Analysis”, etc. From the applied statistician’s point 
of view, this presentation is certainly helpful. Whatever may be said in favor of or 
against the kind of illustrations which Kendall uses to demonstrate the techniques, 
the most remarkable fact is that he spent considerable effort in identifying areas of 
actual and potential applications. 

A commendable feature of the book is the separation of component analysis and 
factor analysis, even though the most essential difference is not clearly stated, viz., 
that the former is a kind of decomposition of variance whereas the latter is a classification
of partial correlation matrices, a fact which was known to Sir Godfrey Thomson
as early as 1934. It is deplorable that the controversies in these two areas, which 
were more frequently prompted by personal motives and ignorance than by objective
reasons, are still discernible in Kendall’s new book, though not as strongly as in one 
of his earlier papers (with Babington Smith in the Journal of the Royal Statistical 
Society 1950). He speaks approvingly of Stone’s interpretation of principal com- 
ponents in economics, in which Kendall is a specialist, and disapprovingly (“. . . The 
general intelligence factor is a notorious case” ...p. 27) of an interpretation in 
psychology, the latter statement being aggravated by the fact that he is referring to 
factor analysis under the heading of component analysis, thereby partly eliminating 
his commendable distinction between the two. The question whether principal axes, 
or some transform like simple structure, are amenable to physical interpretation is as 
futile for the statistician as the question whether the coefficients of orthogonal poly- 
nomials or straight polynomials are amenable to such interpretation. Factor analysis, 
not component analysis, presents a very striking analogy to the problem of curve fit- 
ting by polynomials, and is infested with the same kind of controversial problems, 
though they are not stated with as much belligerence in the latter. Kendall “objects” 
to the centroid method (p. 28); one wonders whether he also “objects” to the variate- 
difference method, a powerful and easily obtainable approximation in a different area. 
His objection becomes all the more questionable since in Exercise No. 7 (p. 182) he 
suggests a centroid analysis on the identity matrix. He seems to overlook the fact that 
no person with any experience in this area would use a centroid in lieu of eigenvectors 
in the relatively easy problem of obtaining principal components, except, perhaps, as 
a starting point for some more powerful iterative techniques than the ones mentioned 
on pp. 24 and 29. It is only in the much more difficult case of obtaining the maximum- 
likelihood solution of factor analysis (Lawley) that one occasionally substitutes the 
centroid because the computational labor of the former is easily ten times as hard as 
that of the latter. 

The treatment of communalities (pp. 43 to 45) is misleading. The author seems to 
forget that “communalities” are unique quantities for a given correlation matrix 
which can be obtained if we spend enough time on computations, and that we observe 
differences only when we get tired of the excessive computations and become satisfied 
with a more or less close approximation. The statement of the Ledermann equation 
(3.16 on p. 43) is erroneous. It should state that, in order to reduce the rank of a cor- 
relation matrix below m given by (3.16), we must have certain equalities among the 
elements of the matrix. A reduction to rank m still presupposes that the elements 
satisfy certain inequalities; in general we cannot carry the reduction to a rank less
than $p-1$. This is one of the earliest misunderstandings in factor analysis and has
been dealt with extensively in the literature. 

The chapter on Canonical Analysis elaborates on the distinction between regression
and correlation, a source of so much confusion in the past that the difference cannot
be overemphasized. We somewhat miss a reference to the fact that the largest
canonical correlation in the sample is a statistic used for testing the hypothesis of 
independence between two sets of random variables, with a well-known and, by this 
time, well tabulated distribution function for the central case (Pillai 1957, Heck 
1958), so that Bartlett’s technique, based on likelihood-ratio statistics (see p. 97) 
can now be replaced by a more readily available one. 

The Historical Notes (pp. 105 to 116) contain some interesting and some less well
known references to the literature. It would be helpful to find a distinction between
methods which have found considerable practical justification and those which are 
primarily mathematical speculations. The uninitiated reader would profit from the 
knowledge that the Wilks-Bartlett system, with Hotelling’s T? as a special case, con- 
stitutes an extremely powerful tool, whereas the variety of distance functions (pp. 112 
to 116) constitutes but a small sample of possible approaches to these problems. The 
omission of Roy’s system in this chapter seems to indicate a rather one-sided pres- 
entation of the literature. 

On the whole we must concede that Kendall’s Course is a valuable contribution in 
that it proposes a new approach to the teaching of multivariate analysis. A course 
based upon the chapters on multivariate analysis in Rao’s book, the chapter in Ken- 
dall’s Advanced Theory of Statistics, somewhat modernized and extended by references 
to Anderson’s and Roy’s treatises, and presented in the sequence proposed by Ken- 
dall’s Course would indeed be the best we can offer in the complex area of multi- 
variate analysis today. I am sure that Kendall was able to base a remarkable set of
lectures on the notes contained in his Course. A less experienced instructor, however, 
would find it difficult to offer a valuable course on the basis of these notes alone. 


Scientific Inference. Second Edition. Harold Jeffreys. New York: Cambridge University 
Press, 1957. Pp. viii, 236. $4.75. 


Patrick Suppes, Stanford University


This second edition of Jeffreys’ well-known book is to be heartily recommended.

Since the first edition appeared many years ago (1931), I will indicate the breadth 
of the book by giving the chapter headings: I Logic and scientific inference. II Proba- 
bility. III Sampling. IV Errors. V Physical magnitudes. VI Mensuration. VII Newto- 
nian dynamics. VIII Light and relativity. IX Miscellaneous questions. X Statistical 
mechanics and quantum theory. The last chapter is new to this edition; the remainder 
of the book has been thoroughly rewritten. But the central focus of the book is un- 
changed. It is a vigorous defense of the author’s conception of scientific method as 
based on his own variant of subjective probability, which he calls epistemological 
probability. The first four chapters are particularly concerned to show the central role 
of probability in scientific inference. Readers of this journal need scarcely be reminded 
of the necessity to consider problems of sampling and of error in any account of scientific
method which is not hopelessly naive, but Jeffreys’ emphasis on these topics in the
third and fourth chapters is welcome. 

Although I think this book will be rewarding for those interested in serious analyses 
of the fundamentals of scientific method, and although I commend the author for his 
willingness and ability to grapple with the kind of problems which arise in actual 
scientific practice, there are two critical comments of a general nature and several 
more specific ones which I want to make. Considering the foundational character of
the book, Jeffreys does not adequately defend his conception of probability, which is
similar to Keynes’, against the other major contenders in the field. The relative 
frequency theory is given a cursory discussion (pp. 181-5), and the work of B. de 
Finetti, B. O. Koopman, and L. J. Savage is not mentioned in any respect. English 
contributions, in contrast, are given notice out of proportion to their importance. 

My second general comment is that almost none of the basic concepts or principles 
which are considered central to scientific inference are formulated in a satisfactorily 
exact manner. This is not because the book is elementary in character; a knowledge 
of physics and a considerable degree of mathematical sophistication are required 
for a full reading of all chapters. Some examples of this essential lack of precision may 
be given. The axioms for probability in Chap. II are not sufficient to guarantee numer- 
ical assignment of probabilities, although the assumption of such an assignment is 
crucial to later chapters. Much is made of the postulate of simplicity for selecting
among hypotheses, and doubtless some vague ideas on this subject are daily heuris- 
tically helpful to working scientists, but the definition of simplicity that is given in 
terms of the number of free parameters (p. 39) obviously is not adequate for a sys- 
tematic theory of simplicity. It is characteristic that the rather sizable logical litera- 
ture on the subject of simplicity is not mentioned. What is important is that this 
literature shows how extraordinarily difficult it is to come by an adequate definition. 

The chapter on mensuration has an interesting development of the theory of 
measuring distance and the way in which geometry may be based on it, but the 
chapter is lacking in the postulational precision one has come to expect in con- 
temporary discussions of the foundations of measurement or geometry. Next to the 
science of measurement of distance, the most exact empirical science is classical New- 
tonian dynamics, to which the author turns in Chap. VII. Much of the theory of mo- 
tion of bodies acted on only by gravitational forces is developed. What is lacking is a 
precise formulation of the fundamental assumptions of dynamics as has been attempted 
by Hertz, Hamel, and others. The aim of the chapter seems to be to show how dynamics
arose or could have arisen from observation of the solar system, yet there is no
attempt at historical accuracy. 

Chaps. VIII and X are full of interesting comments about modern physics. But 
Chap. IX consists of unworked-out obiter dicta on more than a dozen topics, which 
range from psychoanalysis to the philosophical doctrine of phenomenalism. 


Moderne Methoden in der Agrarstatistik. Heinrich Strecker. Einzelschriften der 
Deutschen Statistischen Gesellschaft Nr. 8. Wuerzburg, Germany: Physica-Verlag, 1957. 
Pp. 141, 18 tables, 1 map. D.M. 17.50. Paper. 


Gerhard Tintner, Iowa State College


This interesting little book deals with modern statistical methods, especially sample
surveys, as applied to problems of Western German agriculture. It is written in 
German, but it contains an excellent English summary and should be accessible to 
English-speaking statisticians with a little knowledge of German. Applied statisticians 
who are interested in sample surveys, agricultural economists who concern themselves 
with the problems of Western European agriculture, and economic statisticians in 
general will profit from reading this book. 

The author is a lecturer (privatdozent) in statistics at the University of Munich. 
He is a student of the well-known theoretical statistician, Oskar Anderson of the 
University of Munich. He starts out by giving a short historical survey of sampling 
methods. He is thoroughly familiar with the international literature on the subject,
especially the books and articles written in English. 

There was a census of agriculture in Western Germany in 1949. But preliminary 
results were obtained quickly by stratified sampling, with three strata. The results 
were encouraging; in spite of the fact that the sample comprised only 2% of all
agricultural enterprises, 9/10 of the deviations from the complete enumeration in the 
census were less than 10%. 

Continuous stratified sampling has been used on a half-yearly basis since 1952 in an 
effort to estimate the German farm labor force. The sample comprises 8% of the total
number of farms. A four-stage sampling process was used for crop statistics. Simi- 
larly, a four-stage sampling process gives estimates of fruit yields. 

A livestock sample census used two strata in 1950, the Bavarian census of 1952 
four strata. For all Western Germany, the sampling average was about 12%. Three 
strata were used in a sample survey of milk production. A land-use survey in 1952
utilized about 3.5% of all communities. It is encouraging to learn that the most 
modern methods of sampling have been used successfully in Western Germany since 
the last war. 

There is also a very complete bibliography and a name and subject index. On the 
whole, it can be said, this book maintains the high level of the previous contribution
of H. Kellerer to the theory of the subject, Theorie und Technik des Stichprobenverfahrens
(2nd ed. Physica-Verlag, Wuerzburg, 1953). May we expect in the near future
additional significant contributions by German scholars to the theory and practical 
application of statistics? 


Concise Tables for Statisticians. K. C. Sreedharan Pillai. Manila: The Statistical Center, 
University of the Philippines, 1957. Pp. 50. 3.00 Ph. pesos. 


This small book of tables contains several new tables for multivariate test criteria
as well as tables in common use. Included are the ordinates and areas of the 
normal curve, the z transformation of the correlation coefficient, 10,000 random digits 
and quantiles for the distributions of t, χ², F, range, one- and two-sample range analogs
of t-statistics, range-midrange test statistic, ratio of two ranges, studentized
range, studentized extreme deviate from the population mean, studentized maximum 
modulus, correlation coefficient (under independence) and the largest characteristic 
root and the sum of the characteristic roots of a matrix. At least the 5% and 1% 
level quantiles are given, and for all nonsymmetric distributions except the last two, 
upper and lower quantiles are given. The new tables give the upper 5% and 1% 
quantiles for the largest root or the sum of the roots of a matrix, whenever the roots 
are distributed in the Fisher-Hsu-Roy distribution and the number of nonzero roots 
is 2, 3, 4 or 5 for the largest root tables, and 2, 3, or 4 for the sum tables. A brief 
introduction gives some examples of use of the new tables. 

D. L. Wallace


Statistical Yearbook, 1957. Ninth Issue. Statistical Office of the United Nations. New York: 
Columbia University Press, 1958. Pp. 674. Paper, $6.50; cloth, $8.00. 


The ninth issue of the Statistical Yearbook, covering 1957, prepared as before by the
Statistical Office of the United Nations, received more and wider co-operation in 
preparation than before. For the first time a completed questionnaire was received 
from the German Democratic Republic (“East Germany”) and for the first time in 
recent years from Czechoslovakia. “As a result, it has been possible to give in the 
present Yearbook a wide range of statistics for the USSR and all the Eastern European 
countries except Albania, thus largely filling in an important but unavoidable gap in 
the world statistical picture presented in earlier issues of the Yearbook. ... Due to 
the inclusion of a new chapter on international economic aid and several new tables, 
the present Yearbook contains more tables (191) than any of the preceding issues.” 
The new chapter on “International Economic Aid” consists of three tables relating 
to the period 1954-1956. They “show, in U. S. dollars, grants and loans furnished to 
under-developed countries by individual governments and international governmental
agencies, the total aid received related to population and per capita gross
national product, and the contributions of governments to international technical 
assistance, relief and lending agencies.” Three new tables have been added to this 
chapter, ocean freight rates; discount rates of central banks; and books: translations.
Data on man-made noncellulosic fibers have been expanded to form a new table,
and the table on social security schemes omitted last year has been reinstated. The 
Yearbook has dropped tables on railway rolling stock and on wholesale prices of 
selected commodities. 

This year’s volume contains 28 more pages than last year’s and has increased in 
price for both paper and cloth bindings. As usual, it is in both English and French. 

D. D. F. 


Statistical Year Book. Quebec 1956-57. Bureau of Statistics, J. C. McGee, Director. Quebec: 
Minister of Trade and Commerce, 1958. Pp. xx, 609. $2.00. Printed in French and English. 


“The 1956-57 edition of the Statistical Year Book is the fortieth to be published
by the Quebec Bureau of Statistics.” The Preface states: “The Province of Quebec 
is in full economic expansion and to meet new circumstances far-reaching changes 
are taking place.” In Quebec’s agricultural economy the creation of a marketing 
board is the most important event of recent years. The chapter on Agriculture gives 
the history and work of this body. In the field of natural resources, hydro-electric 
power has been greatly increased and mining operations are creating a steadily in- 
creasing demand for it, so the present edition contains a special article about it. 
Information on Construction has been put in a chapter separate from Manufactures, 
with values, contracts, and building permits included. This volume gives the returns
of the Quinquennial Census taken in June 1956, with data from it added to chapters
on Population and Agriculture. 


D. D. F. 


Games and Decisions: Introduction and Critical Survey. R. Duncan Luce and Howard 
Raiffa. New York: John Wiley and Sons, Inc.; London: Chapman and Hall, Limited, 
1957. Pp. xix, 509. $8.75. 


Irwin Mann, The RAND Corporation


In the last dozen years there have been many papers but few books on the
theory of games. The first and monumental book, which stimulated most activity 
in the subject, is the Theory of Games and Economic Behavior, by Von Neumann and 
Morgenstern, which appeared in 1944. The theory has come to have vitality espe- 
cially for mathematics, though also for statistics (Wald’s maximin utility criterion is 
strongly influenced by it), and it has some bearing in the behavioral sciences, military 
strategy, operations research, and the philosophy of natural science. 

“Game” is of course a generic name for any situation in which there are the equiva- 
lent of persons (“players”) who have choices (“rules”) which taken together determine 
outcomes. Of course, the players have individual preferences among the outcomes, 
but only partial individual control over them. The problem is to describe a “best” 
way for a person to “play”. Actually in order to have a game the players should 
know completely the rules and the utilities of the outcomes for each of the players. 
But sometimes decisions must be made in the face of uncertainty about these. In some 
situations, it may even be suggestive to think of nature as an adversary. 

There are several ways of classifying games. One of the most important is that they 
are either zero-sum (a term indicating that the sum of “winnings” is equal to the sum 
of “losings”) or non-zero-sum. Non-zero-sum games are easy to manufacture mathematically,
and it is commonly recognized that real-world games such as take place in
the marketplace, in love and in war, are likely to be non-zero-sum. Another classifica- 
tion is that of two-person as against n-person games. The theory in the n-person case 
is much more difficult, and real-world games are likely to be of that variety. 
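
As a small illustration of the zero-sum classification described above, the following Python sketch checks whether a two-person game, given as a pair of payoff matrices, is zero-sum. The function name and both example games are hypothetical illustrations and are not taken from the book under review.

```python
def is_zero_sum(payoffs_a, payoffs_b, tol=1e-12):
    """Return True if the two players' payoffs cancel in every outcome."""
    return all(
        abs(a + b) <= tol
        for row_a, row_b in zip(payoffs_a, payoffs_b)
        for a, b in zip(row_a, row_b)
    )


# Matching pennies (hypothetical payoffs): one player's gain is the other's loss.
pennies_a = [[1, -1], [-1, 1]]
pennies_b = [[-1, 1], [1, -1]]

# A dilemma-style game (hypothetical payoffs): both players can lose together.
dilemma_a = [[-1, -10], [0, -5]]
dilemma_b = [[-1, 0], [-10, -5]]

print(is_zero_sum(pennies_a, pennies_b))
print(is_zero_sum(dilemma_a, dilemma_b))
```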

The subject does not require much introduction or development. The theory then 
gives rise to whole new classes of “mathematical” games which have interest for their 
simplicity, generality, or nature of solutions. It is characteristic of the theory that 
though these games are fascinating to mathematicians, they might be downright dull 
or too costly in the parlor, and in fact, the complementary situation is also true. For 
example, chess is a game which is already determined in favor of white or black or a 
draw. Whether or not an adequate concept for “solution” exists is a crucial issue in 
the theory of games. Solutions have been found for certain classes of games while for 
other classes there are still open questions. Another central issue is determining,
by some kind of calculation, the solution to a game once it is known that there is one.
Games and Decisions however is concerned primarily with the first notion, that of 
understanding what can be said about rational behavior in the context of a game. 

The book is an impressive critical survey (that grew out of the authors’ own review 
of the subject) of many ideas having the title as a central theme. It is beautifully 
organized and scholarly, yet warm, informal, and introspective. Even though the 
authors are trained as mathematicians, they have a broad interest in the social 
sciences, and very few other mathematicians could have written the book. Though 
well aware of the values of rigour, they have attempted to minimize the mathematical 
details though a small and necessary portion is written in algebra. The main interest 
is centered in relevant concepts, and in determining their strength or weakness.

However, the mathematician may profit from what mathematics there is, and he 
will naturally be sensitive to it. There are even some open questions, though they are 
of course hard (as some problems simple to state are deep). Most of the complicated 
mathematical results are skipped. But it should be mentioned that the authors do 
not really coddle the reader, and in style seem to cut through many levels of detail 
(even the index is brief). There is an excellent bibliography. 

The book announces its intention of surveying the problem of individual decision 
making, usually in a social context. The decision may be made under certainty, risk, 
or uncertainty. Risk of course here means certainty up to a known probability dis- 
tribution over the states of nature. Games fall in the category of decision making 
under what is a combination of risk and uncertainty, though if there is a “reliable” 
solution of the game, then there is only risk. 

It is probably fairest to outline the book by listing the chapter titles: (1) general 
introduction to the theory of games, (2) utility theory, (3) extensive and normal forms, 
(4) two-person zero-sum games, (5) two-person non-zero-sum non-cooperative games, 
(6) two-person cooperative games, (7) theories of n-person games in normal form, (8) 
characteristic functions, (9) solutions, (10) ψ-stability [this theory is due to Luce], (11)
reasonable outcomes and value, (12) applications of n-person theory, (13) individual 
decision making under uncertainty, and (14) group decision making. There is also a review
of related technical topics which is valuable and relegated to eight appendices
(114 pages). They include: (1) a probabilistic theory of utility [due to Luce], (2) 
the minimax theorem, (3) and (4) two geometrical interpretations of a two-person 
zero-sum game, (5) linear programming and two-person zero-sum games, (6) solving
two-person zero-sum games, (7) games with infinite pure strategy sets, and (8) sequential
compounding of two-person games.

In order to give some feeling for the material of the book, it might be best simply 
to summarize two representative chapters. Chapter 6 on Two-Person Cooperative 
Games treats the two-person non-zero-sum game where cooperation and communication
between players is permitted. The negotiation set is introduced and consists of
all undominated payoffs for which each player gets at least his maximin value. Von 
Neumann and Morgenstern called this set itself the solution, feeling that mathe- 
matics alone could not contribute more. 
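
The definition of the negotiation set can be made concrete with a rough Python sketch. It rests on strong simplifying assumptions that are not the book's: only pure-strategy outcomes are considered (no randomized or jointly randomized strategies), and each player's maximin value is likewise computed over pure strategies. All names and the example payoffs are hypothetical.

```python
def security_levels(payoffs_a, payoffs_b):
    """Pure-strategy maximin values; player A picks a row, player B a column."""
    level_a = max(min(row) for row in payoffs_a)
    level_b = max(min(col) for col in zip(*payoffs_b))
    return level_a, level_b


def negotiation_set(payoffs_a, payoffs_b):
    """Undominated pure outcomes giving each player at least his maximin value."""
    outcomes = [
        (a, b)
        for row_a, row_b in zip(payoffs_a, payoffs_b)
        for a, b in zip(row_a, row_b)
    ]
    level_a, level_b = security_levels(payoffs_a, payoffs_b)

    def dominated(p):
        # p is dominated if a different payoff pair is at least as good for both.
        return any(q != p and q[0] >= p[0] and q[1] >= p[1] for q in outcomes)

    return sorted(
        {p for p in outcomes
         if not dominated(p) and p[0] >= level_a and p[1] >= level_b}
    )


# Hypothetical coordination game in which the players prefer different outcomes.
bos_a = [[2, 0], [0, 1]]
bos_b = [[1, 0], [0, 2]]

print(negotiation_set(bos_a, bos_b))
```

In this restricted setting the sketch leaves two candidate payoff pairs, which is the kind of indeterminacy that the arbitration schemes are meant to resolve.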

The rest of the chapter is concerned with possible reasonable arbitration schemes. 
The work on such abstract schemes is due partly to Raiffa. Several properties which 
such a scheme seemingly should have are mentioned, but the reader is cautioned that 
their reasonableness should not be assessed too firmly in an abstract setting. The 
Nash solution to the bargaining problem, and his extension of it to the general game, 
are brought out. Four assumptions are enough to provide a unique solution to the 
simple bargaining problem. The negotiation scheme in the extended bargaining 
model can be made to rest on seven axioms, at least two of which, the authors feel,
are hard to justify. However, they feel the model and the axioms complement each
other very well. The Shapley value from the theory of n-person games is introduced 
ahead of time, and applied to the two-person case. Finally, the possibility of inter- 
personal comparisons of utility is considered. If this can be done initially, then the 
game is reduced to one of relative advantage. However, in most cases, such inter- 
personal comparisons are not at all easy. Two methods are given, with an example, 
which use the specific game to suggest ways in which the units of utility can be 
changed to establish comparisons. The chapter closes with some discussion of the 
continuity of arbitrated solutions with changes in the utility scales. 

Chapter 13 on Individual Decision Making Under Uncertainty begins with a 
formulation of the problem where a matrix of utilities is given for the consequences of 
acts under different states of nature. If one knows an a priori probability distribution 
for the states of nature, his problem is reduced to one of decision making under risk. 
The residual problem is: what constitute reasonable criteria for making an optimal 
decision in the face of uncertainty? Criticisms of several criteria are analyzed. These 
include maximin utility or minimax loss, minimax risk or regret, the pessimism- 
optimism index, and the principle of insufficient reason. Having done this the authors 
delve into the desiderata which a respectable decision criterion ought to fulfill. These 
axioms are carefully inspected for their consequences to the criteria. 
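
The four criteria named above can be compared on a small utility matrix. The Python sketch below uses the usual textbook forms of these rules; the function names, the convention that the index alpha weights the worst outcome, and the example matrix are assumptions made for illustration and are not taken from the book.

```python
def maximin(utilities):
    """Act whose worst-case utility is largest."""
    return max(range(len(utilities)), key=lambda i: min(utilities[i]))


def minimax_regret(utilities):
    """Act whose largest regret (shortfall from the best act per state) is smallest."""
    n_states = len(utilities[0])
    best = [max(row[j] for row in utilities) for j in range(n_states)]

    def worst_regret(i):
        return max(best[j] - utilities[i][j] for j in range(n_states))

    return min(range(len(utilities)), key=worst_regret)


def pessimism_optimism(utilities, alpha):
    """Weight alpha on the worst outcome of each act and 1 - alpha on the best."""
    return max(
        range(len(utilities)),
        key=lambda i: alpha * min(utilities[i]) + (1 - alpha) * max(utilities[i]),
    )


def insufficient_reason(utilities):
    """Treat all states as equally likely and maximize average utility."""
    return max(
        range(len(utilities)),
        key=lambda i: sum(utilities[i]) / len(utilities[i]),
    )


# Rows are acts, columns are states of nature (hypothetical utilities).
example_utilities = [
    [2, 2, 2],    # act 0: safe
    [0, 10, 1],   # act 1: a gamble on the second state
    [1, 6, 6],    # act 2: a compromise
]

print(maximin(example_utilities))
print(minimax_regret(example_utilities))
print(pessimism_optimism(example_utilities, alpha=0.2))
print(insufficient_reason(example_utilities))
```

On this matrix the criteria do not all agree on an act, which is why a list of desiderata for choosing among them is needed.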

The chapter turns to statistical decision making. The question is now: how should 
partial knowledge about the states of nature, gleaned by experiment, be processed? 
Savage’s theory of a subjective probability measure such that expected utilities cor- 
rectly reflect preferences is restated. If an a priori distribution over the states of 
nature can be found, then experimentation only alters it, using the Bayes relations. 
If such a distribution is not known or assumed, then the analysis is a little more
complicated. By choosing among decision rules instead of among terminal acts, and 
associating with each experiment an act, the decision rule-state of nature pair has a 
well defined lottery of consequences, and this can be evaluated by its expected utility. 
This abstractly reduces the statistical decision problem to the previously considered 
problem of decision making under uncertainty. Lastly, some of the classical problems
of statistical inference (testing hypotheses, point estimation, and confidence interval 
estimation) are indicated briefly. It is pointed out that modern, unlike classical, 
decision theory formally tries to take account of all the consequences which are 
attributable to actual terminal acts. 
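
The case in which an a priori distribution is available can be sketched in a few lines of Python: the experiment changes the distribution through Bayes' rule, and the act is then chosen by expected utility. The prior, likelihoods, and utilities below are hypothetical, as are the function names.

```python
def posterior(prior, likelihood):
    """Bayes' rule; likelihood[s] is P(observed result | state s)."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    total = sum(joint)
    return [j / total for j in joint]


def best_act(utilities, probs):
    """Act with the largest expected utility under the given state probabilities."""
    def expected(i):
        return sum(u * p for u, p in zip(utilities[i], probs))

    return max(range(len(utilities)), key=expected)


# Two states of nature, two acts (all numbers hypothetical).
prior = [0.5, 0.5]
likelihood = [0.9, 0.2]          # the observed result is more likely under state 0
utilities = [[4, -6], [0, 0]]    # act 0 pays off only in state 0; act 1 is "do nothing"

updated = posterior(prior, likelihood)
print(best_act(utilities, prior), best_act(utilities, updated))
```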

The authors understand that the applications of this theory are not so clear. Just 
as chess is an idealized version of war, so to some much lesser, though not in the sense
different, extent are “abstract” games idealized versions of “real” games. When 
the real world is somehow modelized, and no essential aspects of the situation have 
been abstracted away, then the theory of decision and its implications may come into
play. And in any case, the theory is alive and will appeal to those who have a love of 
the orderly and the abstract, and the unexpected richness of mathematical concepts. 


Linear Programming and Economic Analysis. The Rand Series. Robert Dorfman, Paul A. 
Samuelson, and Robert M. Solow. New York: McGraw-Hill Book Company, 1958. Pp.
ix, 527. $10.00. 


Francis W. Dresch, Stanford Research Institute


This book is addressed to the economist who knows economic theory and enough
mathematics for marginal analysis or the theory of imperfect competition but is 
not an accomplished mathematician. The book is designed to give such a reader a 
broad introduction to the theory of linear programming. The authors have succeeded 
remarkably well in this, their primary objective, although they forget their nonmathe- 
matical accent on occasion. Four references to the Euler theorem on homogeneous 
functions, for example, fail to give any real clue as to what it is. 

The expressed hope that this book might also provide “mathematicians interested 
in programming problems with insights into the vast body of modern economic 
theory” is probably overoptimistic. To have met this secondary objective adequately 
would have required a prohibitive expansion of the background economic material 
alluded to throughout the text. On the other hand, the reader familiar with matrix 
algebra may find his patience strained by lengthy discussions of numerical examples,
or successions of special cases of decreasing simplicity, and tend to rely more on the 
excellent mathematical footnotes than on the text. 

Production functions and valuation are related to linear programming and the 
duality theorems by the shadow price concept. The analysis is applied to the theory 
of the firm, the general economic equilibrium, efficient programs of capital accumula- 
tion, and welfare economics. There are two chapters on static Leontief input-output 
models and a chapter on nonlinear programming. Two chapters discuss game theory 
and its relation to linear programming. Appendixes on the probabilistic approach to 
utility theory and on matrix algebra are concise but very readable. The whole book 
is an excellent example of thoroughly motivated exposition achieved without sacrifice 
of precision or accuracy. 

Although the coverage is very satisfactory for the economist, or even for the man- 
agement scientist, the statistician may regret with the authors that space limitations 
forced them to omit any discussion of the role of linear programming in statistical 
decision theory. 


Queues, Inventories and Maintenance, Publications in Operations Research No. 1. 
Philip M. Morse. New York: John Wiley and Sons, 1958. Pp. ix, 202. $6.50. 


Karlin, Stanford University


Although studies of queueing (waiting time) models began some 45 years ago with
Erlang, the rigorous formulation of the probability structure of these queueing 
processes is relatively recent. 

Queueing situations arise in a variety of forms, as, for example: landing of aircraft, 
loading and unloading of cargo ships, scheduling of patients in clinics, telephone waiting
times, the servicing of knitting machines, etc. A queueing model has three characteristics:
(1) an input process which describes the nature of the arrival of customers,
(2) a queue discipline determining the order of service, (3) the service mechanism 
which indicates how many servers are available and the nature of the service time 
distribution. 
Some of the elements of interest are: the waiting time of the customer, the number
of customers in the queue at a given time, and the length of the over-all busy period of 
a server. All of these quantities are usually random variables whose distributions 
are to be determined. 

Queues, Inventories, and Maintenance is the first collection of applications of 
queueing theory which exploits some of the recent sophisticated mathematical devel- 
opments for the solution of special models. The bulk of the discussion (the first 9 of 
the 11 chapters) is devoted in the main to the tabulation of the queueing characteris- 
tics of numerous special cases. Chap. 10 considers some simple illustrations of in- 
ventory phenomena which may be interpreted in terms of queueing. The final chapter 
examines a machine repair problem whose breakdown mechanism resembles a queue- 
ing process. Here, the length of time for repair corresponds to service time. 

The book is directed primarily toward the operations researcher and much of the discussion
is justifiably motivated by practical considerations. For example, balancing
service costs versus customers lost due to customer impatience raises a general problem 
which underlies much of the discussion. 

The first chapter discusses the classical, but important, queueing model of ex- 
ponential input and exponential service. Here, steady state probabilities of waiting 
time, length of line, etc. are developed. The following chapters extend these results 
to the case where service time is a gamma distribution of integral order and the in- 
put process is a finite superposition of Poisson processes. To get explicit formulas, 
elaborate algebraic and analytic manipulations are carried out using generating func- 
tion technique and implicitly the method of the embedded Markov chain initiated 
by David Kendall. In most respects the cases examined are the simplest and need not 
appeal to the extensive theory of Markov processes. The author himself admits the 
simplicity of the model and purposely confines himself to these in order to better 
achieve the evaluation of a variety of operational questions. As a result, he was 
compelled, unfortunately, to leave out the important mathematical tools of the 
Wiener-Hopf technique, basic to the analysis of the general equilibrium theory. It
would also have been valuable to point out connections of queueing theory to other 
models in stochastic processes, notably birth and death processes, renewal processes,
and counter processes. 
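
For readers unfamiliar with the model of exponential input and exponential service, the following Python sketch evaluates the standard textbook steady-state quantities for a single server. The formulas are the usual ones for this model and are not transcribed from the book; the function names and the numerical rates are hypothetical.

```python
def mm1_steady_state(arrival_rate, service_rate):
    """Steady-state summaries for one server with Poisson arrivals, exponential service."""
    if not 0 < arrival_rate < service_rate:
        raise ValueError("steady state requires 0 < arrival_rate < service_rate")
    rho = arrival_rate / service_rate          # fraction of time the server is busy
    return {
        "utilization": rho,
        "mean_in_system": rho / (1 - rho),
        "mean_in_queue": rho ** 2 / (1 - rho),
        "mean_time_in_system": 1 / (service_rate - arrival_rate),
        "mean_wait_in_queue": rho / (service_rate - arrival_rate),
    }


def prob_n_in_system(n, arrival_rate, service_rate):
    """Steady-state probability of exactly n customers (geometric distribution)."""
    rho = arrival_rate / service_rate
    return (1 - rho) * rho ** n


# Hypothetical rates: 2 arrivals and 4 service completions per unit time.
summary = mm1_steady_state(2.0, 4.0)
print(summary)
print(prob_n_in_system(0, 2.0, 4.0))
```

The mean number in the system equals the arrival rate times the mean time in the system, which gives a quick consistency check on any such table.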

Part of Chap. 6 is devoted to an elementary discussion of the time-dependent queueing
case. It is unfortunate that the elegant work of Takács on the transient queueing
process is not exploited. Some results of the multiple server steady state exponential 
queueing process are summarized in Chap. 8. All of the first eight chapters concentrate 
on the case of the queue discipline with first come, first serve. In contrast, Chap. 9 
considers a queueing model with two types of inputs whose operation is subject to a 
priority queue discipline. Here is a situation of a single server having to deal with
priority items and nonpriority items, such that priority items displace in service 
nonpriority items. The author develops expressions for the expected waiting time 
and the distribution of length of line in the case where the input process is Poisson 
and the service distribution is exponential. The last two chapters, concerned with 
applying queueing concepts to inventory and maintenance problems, are interesting 
but preliminary. 

The book is very well organized and focuses quickly on interesting applications. 
Unfortunately, the high frequency of misprints makes for difficult reading. Nevertheless,
this book is an invaluable compilation of distribution theory for the characteristics 
of a large number of useful queueing models. Insights, applications, and interpreta- 
tions of the theoretical queueing results related to operational problems of waiting 
time, maintenance, and inventory will serve for many years to come as the reference 
for all future studies of problems of this kind. 
Toeplitz Forms and Their Applications. Ulf Grenander and Gabor Szegő. Berkeley and Los
Angeles: University of California Press, 1958. Pp. vii, 245. $6.00. 


Leonard J. Savage, University of Chicago


This is a by no means popular monograph on a topic in pure mathematics. It is
mentioned here because of its relevance for serious students of (weakly) stationary
stochastic processes. A sequence of Toeplitz forms $T_n$ (in terms sufficiently general
for this brief review) is a sequence of quadratic functions $T_n = \sum C_{s-t} u_s u_t$ (where
$s, t = 1, \cdots, n$), especially when the $C_j$ can in some sense be thought of as Fourier
coefficients. In the application to a time-discrete stochastic process $X_j$, $C_j$ would be the
covariance of $X_0$ with $X_j$. There are of course generalizations to time-continuous
processes and to complex valued processes.

Beginning with Toeplitz in 1910, there has been a ramified and subtle analytical 
theory of Toeplitz forms in which Szegő has been one of the principal workers. Interest
and progress in the theory have reblossomed in the last decade or so mainly in
consequence of the pioneering work of A. N. Kolmogoroff and N. Wiener on linear 
prediction and filtering. 

The book is carefully but drily written. As often happens with mathematics books, 
the background required to read it is nominally quite small but actually very consider- 
able. The book will be of direct use for research and advanced study in an important 
area and promises thus to benefit the whole statistical community. 


Soviet Economic Growth: A Comparison with the United States, a study prepared for the 
Subcommittee on Foreign Economic Policy of the Joint Economic Committee. Legislative 
Reference Service of the Library of Congress. Washington: Government Printing Office, 
1957. Pp. xi, 149. $0.40. Paper. 


G. Warren Nutter, University of Virginia


This is a work of topical as opposed to lasting importance. By origin, its primary
aim is to inform legislators and to mold public opinion on issues involved in the 
present struggle between Russian Communism and the West. In the main and by 
design, the broad views presented are probably those of the majority of American 
specialists in Soviet economic affairs. Significant dissenting views appear here and 
there in footnotes, but they are not reflected in statements of summary and conclusion. 
The report covers industry, transportation, agriculture, population and man- 
power, standards of living, and national income. In each case there is a summary of 
Soviet growth since 1928, a comparison with selected periods in the United States, 
and an attempted prophecy of the near future. Every specialist will find something to 
quarrel with somewhere, and most will be generally unhappy with explanations of 
sources of data and methods of estimation. The primary areas of controversy are, 
however, industry and national income, and I wish to direct my remarks to them. 
The section on industry concludes with this statement (p. 47): “The above exami- 
nation of the factors influencing Soviet industrial output suggests that the rate of 
growth of Soviet industrial output during the next 5 to 10 years will slow down perceptibly,
but it does not suggest that the slowdown will be sharp enough necessarily
to cause the rate of increase to fall to the current level in the United States (near 
4½ per cent per annum) or even to a rate much below about 7 or 8 per cent per annum
on the average.” Like all prophecies, this one is so stated that it can hardly be dis- 
proved no matter what happens. Nevertheless, a specific number—7 or 8 per cent— 
is introduced as, the hedging ignored, a bottom for the rate over the next decade. 
This is interesting to me since my own work indicates that the average annual rate of 
growth, calculated from Western-type production indexes, did not exceed that figure 
for 1950-1955 and was even lower for 1955-1957. How may we explain this very 
substantial disagreement? 
In my opinion, the past growth rates for Soviet industrial production given in the
report (p. 24) are significantly exaggerated. It is hardly possible to trace the sources 
of exaggeration since the report’s estimates are made by a roundabout procedure of 
adjusting other defective indexes. In any event, the report’s and my estimates of 
average annual growth rates are as follows: for 1928-1955, 7.7 per cent and 6.1 per 
cent (both adjusted to remove gains from territorial expansion); and for 1950-1955, 
9.9 per cent and 7.7 per cent. Needless to say, differences of this magnitude can mate- 
rially affect conclusions about the past and hence the future. They also suggest that 
further study is desirable to find out which estimates are more reasonable. 
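
To see why differences of this magnitude matter, one can compound the two pairs of average annual rates quoted above over their periods. The Python sketch below does only that arithmetic; the function name is illustrative, and the calculation assumes the rates are compound annual averages over 27 and 5 years respectively.

```python
def growth_multiple(annual_rate_percent, years):
    """Total multiple of output implied by a constant compound annual growth rate."""
    return (1 + annual_rate_percent / 100) ** years


# Rates quoted in the review: the report's estimate and the reviewer's estimate.
report_1928_1955 = growth_multiple(7.7, 1955 - 1928)
reviewer_1928_1955 = growth_multiple(6.1, 1955 - 1928)
report_1950_1955 = growth_multiple(9.9, 1955 - 1950)
reviewer_1950_1955 = growth_multiple(7.7, 1955 - 1950)

print(round(report_1928_1955, 2), round(reviewer_1928_1955, 2))
print(round(report_1950_1955, 2), round(reviewer_1950_1955, 2))
```

Over the longer period the two rates imply roughly a sevenfold as against a fivefold increase in output, so a gap of under two percentage points a year becomes a large gap in levels.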

The report is careful to warn that Soviet data on industrial output are markedly 
less reliable than American data (see, for instance, pp. 8-9, 20-21, 26, and 34). At 
some crucial points, however, these warnings are forgotten. For example, it is rather 
clearly implied (p. 21) that the industrial production index covering the last several 
decades of the nineteenth century for the United States is no more reliable than in- 
dexes covering recent periods for the Soviet Union. The argument is based on the 
poor product coverage of the American index used by the report, namely, the Persons 
index. This is in the text. A footnote explains that an index with much broader cover- 
age—the Frickey index—gives virtually the same results as the Persons index. 

In a number of other respects doubts are resolved in favor of the Soviet Union. 
Short American periods with rapid rates of industrial growth are almost all ruled out 
of comparison with Soviet performance of 1950-1955 for a variety of reasons making 
them “abnormal” (see pp. 22-23 and 24-25). Study of Soviet lags behind the United 
States in output of individual products is described (last paragraph, p. 26) as being 
irrelevant as far as comparing percentage growth rates is concerned and misleading 
for all important purposes. The first criticism is clearly false: if in 1913 Soviet produc- 
tion of steel was 21 years behind American production and in 1955, 29 years behind, 
then growth from the same starting point was more rapid percentagewise in the United 
States than in the Soviet Union. The second criticism is used by the report to explain 
why it does not publish such data. 

A critical shortcoming of the report is its failure to relate Soviet growth to pre- 
revolutionary conditions. I could find only two passing references to data for 1913, 
one in the case of transportation (pp. 48-49) and the other in the case of population 
(p. 82). The omission is deliberate, being explicitly justified on the ground that pub- 
lication of data for the prerevolutionary period would only mislead the reader (see 
p. 6). Perhaps so, but this depends on two things: first, what the reader is interested 
in studying; and second, how intelligent he is. The report is clearly preoccupied with 
forecasting relative Soviet and American growth over the next few years, and for 
this purpose there may be some danger in extrapolating a trend based in part on a 
rather remote period of growth. In my view, there is at least equal danger in ignoring 
the long period. More important is the point that there are other urgent reasons for 
studying Soviet growth. For one thing, we need to set the record straight as to just 
what has been accomplished in Russia since the Bolshevik revolution. Aims of this 
sort are not furthered by leaving out the story of prerevolutionary conditions and 
rates of growth. 

For national income, the faults of the report are on a different level. After reading 
persistent warnings about the unreliability of data in the industry section, one ex- 
pects to find a careful discussion of the enormous shortcomings of estimates of Soviet 
national income. I found only one brief comment on problems of estimating Soviet 
national income in current prices, and that in a footnote dealing with a highly tech- 
nical adjustment of prices to a “factor cost basis” (p. 130). The reader is left to assume 
that estimates of Soviet national income, in the aggregate and by category, are as reli- 
able as estimates of American national income, even though the Soviet figures for 
1950 and 1955 are explained only by the following statement: “Estimated by staff 
following [the] same pattern as preceding [sources] in consultation with specialists 
experienced with these data” (p. 128). The report does state that there are no reliable 
estimates of Soviet real national income (p. 135), but this is after it has stated (p. 132, 
repeated on pp. 134 and 143) that “the United States may be producing currently 
about three times as many goods and services as the Soviet Union.” The source for 
this key figure is a news magazine quoting Allen Dulles. 

All this considered, I doubt that this report performs the job it was intended to do: 
provide the relevant facts that a responsible public needs in order to make intelligent 
decisions on issues in the current East-West conflict. 


Farm Housing. Glenn H. Beyer and J. Hugh Rose. New York: John Wiley & Sons, Inc., 
1957. Pp. xi, 194. $6.00. 

Housing: A Factual Analysis. Glenn H. Beyer. New York: The Macmillan Company, 1958. 
Pp. xxvi, 335. $6.75. 


Richard F. Muth, Resources for the Future 


THESE two volumes have much in common in addition to authorship and subject 
matter. Both are primarily descriptive, adhere to a high level of tabular presentation 
of data, and contain little statistical inference or economic analysis. The first is a 
research monograph which describes data relating to farm housing contained in recent 
censuses, while the second might be characterized as an introduction to the subject 
of housing. 

Farm Housing is another volume in the census monograph series sponsored by the 
Bureau of the Census and the Social Science Research Council. The purpose of this 
series is to make information secured by the censuses available to a wide group of 
persons. The principal purpose of this volume in the series is to describe the farmhouse 
and household in the United States in 1950 and changes which occurred during 
the preceding decade. In this the authors perform very well. However, I feel that the 
volume would have been much more useful had they paid more attention to evaluat- 
ing the economic and/or social significance of the characteristics of farm housing 
they examine and to analyzing more precisely the effect of determinants of these 
characteristics, particularly the level of income as a determinant of housing quality. 

Characteristics for the nation as a whole are described in Chapter 2. Over the decade 
1940-50 the number of farm houses declined by about 17 per cent, and the propor- 
tion which were owner-occupied increased from a little over one-half to slightly less 
than two-thirds. More significant, perhaps, is the fact that the evidence indicates that 
the quality of farm housing increased markedly during this period. The per cent of 
farm dwellings with more than one person per room declined from about 30 to 22. 
While the data on condition are not strictly comparable as between the two censuses, 
they indicate a decline in the proportion dilapidated. Also, the proportion of farm 
houses with preferred facilities tended to increase in most cases. The quality of farm 
housing, as revealed by the per cent of dwellings in good condition with all plumbing 
facilities, in relation to income is next examined for 1950, and, not surprisingly, it is 
found that the quality of farm housing varies directly with the income level of the farm 
family. But even for the highest income group, $10,000 and over, only 71 per cent of 
farm houses were of “high quality,” as compared with 97 per cent for non-farm 
families. The authors feel that this may be partly due to “conservative habits prevalent 
among farmers” (p. 30) and partly to price differences—plumbing facilities 
are probably less costly in non-farm as compared with farm areas. I suspect the latter 
is the real reason. Also important is their conclusion that new construction under- 
taken from 1940 to 1950 failed to raise the level of quality of farm housing. 

In the third chapter a comparison of housing characteristics in twelve type-of-farming 
regions is made. This particular regional classification was probably conditioned 
by the assertion that type of farming has a strong bearing on the type of housing 
occupied by farm families over and above the influence of income. To support this 
hypothesis the authors suggest that dairy farmers are more likely to be influenced by 
urban living patterns than farmers in wheat or grazing areas. Here, too, an alternate 
explanation would be that certain facilities such as indoor plumbing are likely to be 
cheaper in the more urban areas. The latter may account for the finding that, while 
income level seems to explain differences in the proportion of farm houses in good 
condition by region, there is considerable variation by region for a given income class 
in the per cent of farmhouses with all plumbing facilities. The authors recognize the 
importance of cost differences, however, in their discussion of regional differences in 
type of cooking and heating fuel used. But my principal criticism of this chapter is that 
the authors rely too heavily on cross-tabulations of only two characteristics for their 
analysis. Even on a descriptive level it would be very helpful to compute measures of 
partial association between the various characteristics examined rather than to rely 
exclusively on the simple association between various pairs of variables. It would 
have been instructive if the authors had attempted to estimate the regional variation 
remaining after, say, income had been held constant and to ask whether or not this 
residual variability is statistically significant. In addition, the effect of urbanization 
might have been evaluated by including some measure of this variable in a covari- 
ance analysis. 

Elsewhere in this book the authors fail to avail themselves of statistical tech- 
niques that would have facilitated the analysis. Partly because of the cost of tabulat- 
ing for economic subregions data on characteristics published only for counties in 
the 1940 census, the authors wisely decide to employ a sample of subregions for their 
discussion of trends in farm housing characteristics by region. But instead of selecting 
subregions “regarded as typical” for this purpose, as they did, it would seem preferable 
to select a random sample stratified by, say, region and use it as a basis for inferring 
the characteristics for the population of all subregions. Likewise, in their examination 
of the impact of urban centers on farm housing the authors examine averages of cer- 
tain housing characteristics for counties in three concentric rings surrounding 
metropolitan areas. An alternate design would have been to compute the regression 
of a particular characteristic on distance from the metropolitan area using the county 
data as units. In either case some formal significance tests would have proven useful, 
I believe, since in many instances the differences in per cent of farm dwellings having a 
particular characteristic seem quite small, particularly as between dwellings in the 
second and third rings. The chapter on the housing of non-white farm families would 
also have benefited from some multivariate analysis. In explaining the poorer than 
average quality of this kind of housing the authors quite rightly point to the lower 
level of income of non-white families. They also suggest that the high per cent of non- 
white families occupying tenant houses is a factor. But tenants tend to have lower in- 
comes; is there any variation between housing quality of tenant and owner-occupiers 
not accounted for by differences in the level of income? While Farm Housing succeeds 
in providing a lot of useful material, it would have been even more valuable had the 
authors made greater use of statistical techniques in analyzing their data. 
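
The regression design suggested above might be sketched as follows. The county figures are invented purely for illustration and are not taken from Farm Housing; the function name and the choice of ordinary least squares with a conventional standard error for the slope are likewise assumptions of this sketch.

```python
from math import sqrt


def regress_on_distance(distance, characteristic):
    """Least-squares fit of a housing characteristic on distance.

    Returns (slope, intercept, standard error of the slope).
    """
    n = len(distance)
    mean_x = sum(distance) / n
    mean_y = sum(characteristic) / n
    sxx = sum((x - mean_x) ** 2 for x in distance)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(distance, characteristic))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom.
    residuals = [y - (intercept + slope * x)
                 for x, y in zip(distance, characteristic)]
    s2 = sum(r * r for r in residuals) / (n - 2)
    slope_se = sqrt(s2 / sxx)
    return slope, intercept, slope_se


# Hypothetical county data: miles from the metropolitan area, and per cent
# of farm dwellings having some facility (invented values).
miles = [5, 15, 25, 35, 45, 55]
per_cent = [62, 58, 55, 49, 47, 41]

slope, intercept, slope_se = regress_on_distance(miles, per_cent)
t_ratio = slope / slope_se  # compare with a t table on n - 2 degrees of freedom

print(f"slope = {slope:.3f} points per mile, s.e. = {slope_se:.3f}, t = {t_ratio:.1f}")
```

A slope small relative to its standard error would suggest that differences between the rings, such as those between the second and third, need not reflect any real effect of distance.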

Housing: A Factual Analysis is meant as a description of housing as it exists today 
intended for anyone who might be interested in the subject. Its character is perhaps 
best described by the following words from the preface: “This is not a book on 
theory. .. . Data and statistics are used throughout in order to specifically describe 
the subject matter at hand.” (p. vii). A wide variety of topics is discussed, including 
factors determining housing demand, characteristics of the stock of housing, the 
housebuilding industry, housing finance, design, urban renewal, and the role of 
government. In general, the author succeeds in providing useful material relating to a 
wide range of topics for the general reader. Text tables are handled well throughout, 
and there are many photographs and cartoons which add to the interest of the book. 
Like Farm Housing this book contains little application of statistical technique be- 
yond tabular presentation. 

However, this book, too, would have been much more helpful in providing an un- 
derstanding of the subject even for the general reader if more analysis of the economic 
and other aspects of housing had been included. In the discussion of housing demand, 
for example, income is mentioned as one of its determinants and several important 
features of income distributions are effectively discussed, but no attempt is made 
to show how the value of housing varies with the level of income. In his treatment 
of housing taxation the author asserts that it is regressive, however, which implies 
that expenditures on housing increase less than proportionally with income. This I 
believe is incorrect, since estimates I have made elsewhere suggest that housing 
expenditures increase at least in proportion to income.1 While the book contains a 
good deal of informative material on housing finance, the interest rate or the cost 
of financing in relation to housing demand is not discussed, except perhaps inciden- 
tally. The price of housing, along with the vacancy ratio and “overcrowding,” is 
treated as influenced by the “balance” of demand against supply, but no discussion 
of the effect of price on quantity of housing demanded is included. Likewise, the dis- 
tinctions between short and long run changes in price and in “overcrowding” are not 
adequately drawn. In the final chapter the author discusses estimates of housing 
“need” and provides some of his own. I find it hard to take such estimates seriously 
because of the impossibility of giving an objective definition of the term “need.” 
But despite these critical comments I would recommend this volume as a readable 
and useful introduction to the subject of housing. 


The Changing Population of the United States. Conrad Taeuber and Irene B. Taeuber. 
New York: John Wiley and Sons, Inc., 1958. Pp. xi, 357. $7.75. 


Ernest Rubin, American University and Howard University 


TWENTY-FIVE years have passed since Thompson and Whelpton wrote Population 
Trends in the United States, an important study of demographic changes in this 
country. Five years later, in 1938, the staff of the National Resources Planning Board 
produced the significant monograph, Problems of a Changing Population. The present 
volume by the Taeubers is a worthy successor to the earlier works on population 
changes in this country. 

Their work is one of a series of census monographs jointly sponsored by the Social 
Science Research Council and the Bureau of the Census. The purpose of this study is 
to “. . . provide an overall view of the changing American population.” The authors, 
confronted mainly with an embarrassment of riches, have selected the basic com- 
ponents of population change as reflected in the decennial censuses and related demo- 
graphic sources. They have more than fulfilled their objectives. 

The organization of this work is worthy of comment. The volume consists of four 
major parts, three of which analyze growth, social and economic characteristics, and 
natural increase; in the fourth part the authors state their conclusions as to the future 
development of U. S. population. The first part, on growth, describes types of demographic 
changes occurring simultaneously: geographic expansion, coupled with 
migration (internal and external) and the rise of urban centers. In the second part, 
qualitative population characteristics of interest to social scientists are carefully 
analyzed, in particular, marital status, family formation, education, economic activity 
and income distribution. In the third part, on natural increase, the authors examine 
national trends in fertility and mortality. 

1 See my “The Demand for Non-Farm Housing,” (abstract), Econometrica, XXV (April, 1957), 365-66. 

The method of this work consists in (a) utilizing census reports and supplementing 
census information, and (b) combining the best interpretive analyses of governmental 
and non-governmental origin. In this way the authors are able to achieve a homo- 
geneous presentation of this subject which would be impossible if based solely on census 
or government works. 

While of particular interest to the specialist, this book is also recommended to non-specialists 
interested in various aspects of U. S. population. The authors present many 
valuable statistical tables together with careful analyses and suggestive interpreta- 
tions. The book is well written and contains an excellent index. In addition there is a 
valuable appendix of sources for national demographic statistics since 1790. 


Manual of the International Statistical Classification of Diseases, Injuries, and Causes 
of Death. Volumes 1 and 2 of International Classification of Diseases. 1955 Revision. 
World Health Organization. New York: Columbia University Press, 1957. Pp. xli, 393; 
xxvi, 540. $6.75 for both volumes, not sold separately. Also available in French and Spanish. 


ACCORDING to an announcement by the World Health Organization, “The revised 
edition of the Manual has been little changed as compared with the previous one. 
No alterations to the actual structure of the classification have been made... . The 
main modifications are essentially intended to improve the existing provisions or to 
make them more precise. However, in the section dealing with neoplasms, new four-figure 
categories make it possible to classify certain neoplasms according to anatomical 
site in more detail. Some changes have been made in the Nomenclature Regulations 
in order to render certain of its provisions less rigid. Finally, the rules to be 
followed for selection of the underlying cause of death have been modified or made 
more precise in certain points, so as to facilitate their application.” 

W. A. W. 


The Chronically Ill. Joseph Fox. New York: Philosophical Library, Inc., 1957. Pp. xix, 
229. $3.95. 


Harvey L. Smith, University of North Carolina 


THE good will underlying this book shines clearly through. Its scientific merits are a 
little more difficult to discern. Aspects of the problems of chronic illness presented 
include: scope, definitions, effects upon individuals and society, problems of care, 
institutionalization and rehabilitation, economic costs, and ethics. 

The author states that the book is not a medical manual. The dust jacket refers 
to its sociological approach, but this is only minimally evident. One of its tasks is 
“assembling, balancing, and correlating the established facts and figures” needed to 
understand the prevalence and impact of chronic illness. This is the emphasis of most 
direct concern to statisticians. 

The figures presented are a rather haphazard collection, mostly from secondary 
sources, naively and uncritically handled. There is no real statistical analysis, nor 
even a sophisticated winnowing. The author’s most frequent statistical device is the 
simple listing of percentages. 

A few illustrations will show the book’s general tone and approach. Concerning 
muscular dystrophy: “There is some suspicion that heredity plays a part since 35% 
of the cases show a history of the disease in the family.” Another example: “Here are 
a few facts that help. It has been said that ‘it is now possible that at least 50 per cent 
of chronic illness can be arrested if treated early enough’.” A study is cited in which 
41 old people are asked to voice their troubles. Their answers are reported as illustrat- 
ing “the severity and scope of problems which can beset older persons, particularly 
when health fails.” Explanations of figures presented are frequently at the level of 
“men prefer to keep going until seriously stricken,” or “women might be expected to 
complain more readily.” 

Fox has tried to render a service by pointing up the magnitude and effects of a 
health problem. His cause is a good one. It would have been better served by more 
knowledgeable statistical reporting and analysis. 


Chronic Illness in the United States, Volume IV. Chronic Illness in a Large City: The 
Baltimore Study. Commission on Chronic Illness. Cambridge: Harvard University Press, 
1957. Pp. xxii, 620. $8.00. 


Paul B. Sheatsley, National Opinion Research Center 


THE two field studies sponsored by the Commission on Chronic Illness are important 
undertakings, both for their substantive findings on the prevalence of such 
illness and for their contributions to public health research methodology. The present 
volume reports on “the Baltimore study,” actually a series of researches carried out 
between 1952 and 1956. The plan was an ambitious one; probably, as the authors 
suggest (p. 332), too ambitious. First, an interview survey among a random sample of 
4,000 Baltimore families to obtain information about illness and disability among the 
approximately 12,000 individuals comprised in those families. Then, a “clinical 
evaluation” of a stratified sub-sample of 1,000 of these individuals, based on a com- 
plete diagnostic examination at a special clinic established for the purpose at The 
Johns Hopkins Hospital. Third, the administration of a series of screening tests to the 
remaining 11,000 individuals in the sample. Fourth, estimates of “needs for care” 
found among the 1,000 “evaluees.” And finally, attempts to rehabilitate a selected 
group of handicapped persons judged to have some potential for vocational rehabilita- 
tion. 

For a variety of reasons, the last three phases of the project provided rather incon- 
clusive results, although a great deal was learned in the process. The screening tests 
were largely unsuccessful because only 29 per cent of the designated sample could be 
persuaded to come to the clinic (p. 227). Estimates of needs for care were affected not 
only by problems of definition and by the subjective nature of the estimates, but also 
by the fact that the needs of a substantial part of the “high disability” group were not 
estimated in detail (pp. 130, 175). The rehabilitation experiment, not yet completed 
at the time of writing, gave little promise of substantial findings (p. 150). It is con- 
sequently the first two parts of the study—the household interviews and the clinical 
evaluations—which command the most attention. 

The household interview survey, in itself a major undertaking, was executed by the 
Bureau of the Census with remarkable efficiency. The interview was conducted with a 
household informant (usually the housewife) for all related persons in the household; 
unrelated persons were interviewed separately. The questionnaire, a modification of 
the one developed for the parallel study by the Commission in rural Hunterdon 
County, New Jersey, covered illness and disability within the household “yesterday,” 
“within the last four weeks,” and “during the past 12 months.” The information was 
sought by a battery of questions, ranging from the open-ended “Were you sick at 
any time (yesterday, or during the past four weeks)? What was the matter?” to 
lengthy check-lists designed to remind the respondent of illness symptoms or condi- 
tions he might otherwise fail to report. In spite of the length and personal nature of 
the interview, only 11 refusals to be interviewed are reported and interviews were 
completed in 98 per cent of the designated dwelling units. It is interesting that only 
six interviewers and a supervisor were responsible for what must have been approxi- 
mately 4,000 interviews over the course of a year. 

The sample for the clinical evaluations was drawn from the 11,574 individuals 
whose health had been reported in the household interviews. It was stratified in three 
main groups comprising eight cells, which were sampled at different rates to provide a 
high number of persons with substantial illness. Thus, all individuals in a “maximum 
disability” group, all of those with lesser disability but reported to be suffering from 
diabetes, neoplasms, or diseases of the central nervous system; 40 per cent of 
those reporting heart disease, 25 per cent of arthritic and rheumatic cases, but only 6 
per cent of those for whom no disease or only short-term or nondisabling conditions 
had been reported were included in the sample to be evaluated. The total comprised 
1,292 individuals, and intensive efforts were made to induce these persons to undergo 
the clinical evaluation. Unfortunately, in spite of all means of persuasion, only 809 
people (63 per cent of the sample) were willing to take the 4-hour examination, and 
the rate of participation varied for different groups in the population. Seventy per 
cent of the maximum disability group participated, but only 56 per cent of those who 
reported no serious illness; 81 per cent of the children under 15, but fewer than half of 
those aged 65 or over; 74 per cent of the nonwhite, but only 59 per cent of the white 
population consented to the examination. To compensate for the likely bias introduced 
by these differences in willingness to cooperate, a system of weights based on age, sex, 
color and reported health status was introduced, so that the weighted findings are 
theoretically representative of what would have been found had the entire Baltimore 
population been examined. 
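
The logic of such weighting can be sketched in a few lines. The sampling and participation rates for the two groups below are those quoted above; the numbers of persons and the prevalence figures are hypothetical, and the sketch uses reported health status alone, whereas the study's weights also took account of age, sex, and color.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    name: str
    persons: int          # persons in the interview sample (hypothetical)
    sampling_rate: float  # share selected for clinical evaluation
    response_rate: float  # share of those selected who were examined
    prevalence: float     # share of examined with a condition (hypothetical)

    @property
    def examined(self) -> float:
        return self.persons * self.sampling_rate * self.response_rate

    @property
    def weight(self) -> float:
        # Each examined person stands for this many persons in the sample.
        return 1.0 / (self.sampling_rate * self.response_rate)


strata = [
    Stratum("maximum disability", 200, 1.00, 0.70, 1.0),
    Stratum("no serious illness", 10_000, 0.06, 0.56, 0.5),
]


def unweighted_prevalence(groups):
    cases = sum(g.examined * g.prevalence for g in groups)
    return cases / sum(g.examined for g in groups)


def weighted_prevalence(groups):
    cases = sum(g.examined * g.weight * g.prevalence for g in groups)
    return cases / sum(g.examined * g.weight for g in groups)


unweighted = unweighted_prevalence(strata)
weighted = weighted_prevalence(strata)

print(f"unweighted prevalence among examinees: {unweighted:.3f}")
print(f"weighted prevalence:                   {weighted:.3f}")
```

Because the seriously ill were both oversampled and more willing to be examined, the unweighted figure overstates prevalence; the weights restore each group to its share of the interview sample. The correction holds only if, within each cell, those who refused resemble those who were examined.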

The major substantive finding of the Baltimore study, and the reason for its im- 
portance, lies in the unexpectedly large number of chronic conditions diagnosed in the 
clinical evaluations, and the failure of the household interview survey to reveal more 
than a fraction of these conditions. The weighted results indicate that 65 per cent of the 
Baltimore population had one or more chronic conditions. The examining physicians 
diagnosed an average of 1.6 conditions per person, a rate far above any previously 
determined by the interview method. Fifty-six per cent of all these chronic conditions 
were judged to be “substantial”; that is, either disabling in some way or requiring 
care, or likely to in the future. The household interviews gave no hint of such findings. 
An elaborate case-by-case comparison of the interview data with the clinical evalua- 
tion data found that only 22 per cent of the diagnosed conditions had been reported 
by the household informant. Even after eliminating conditions which were not likely 
to have been known and so could not have been reported, only 30 per cent of the 
diagnosed conditions had been mentioned in the interview. For 14 illnesses, the clin- 
ical evaluation detected more than ten times the number of conditions which had been 
reported in the interview. For certain other illnesses, the number of evaluation di- 
agnoses was smaller than the number of household reports. 

One can raise certain questions about these findings, for diagnosis of a “chronic 
condition” depends in part, of course, upon the definitions used. For example, should 
“obesity” be considered a chronic condition, and if so, how is it to be defined? In this 
study, “somewhat arbitrary” criteria for establishing obesity were decided upon, but 
it was considered a chronic illness only if accompanied by certain other conditions 
such as heart disease or diabetes (p. 229). How about dental deficiencies? While it is 
stated (p. 33) that dental conditions should be classified as chronic illness, they are 
not so classified in the study’s prevalence tables, apparently because full information 
on all evaluees was not obtained. (Inclusion of dental deficiencies would raise the 
“chronic illness” rates considerably and also accentuate the lack of correspondence 
with the interview data, since only 3 per cent of the survey sample reported any dental 
problems, while 80 per cent of the examined white sample were judged to have condi- 
tions requiring dental service.) Diagnosis depends also upon the thoroughness of the 
tests employed. The study notes, for example, (p. 93) that “It has been asserted that 
some x-ray evidence of osteoarthritis can be found in nearly every person 50 years of 
age or older. Thus if the clinical evaluation procedure had included routine x-ray of 
designated joints, the resulting prevalence rate might have been considerably higher, 
though perhaps only a little additional disease of current significance would have been 
discovered.” Considerations of this sort obviously affect the amount of “illness” di- 
agnosed. The characteristics of the examining physician also affect the number of di- 
agnoses. While no data are presented here, the comparable Hunterdon study indi- 
cates considerable inter-physician variation in the inconclusiveness of their diagnoses.* 
One might also question the extent to which the weighted clinical evaluations are rep- 
resentative of the total Baltimore population. While the controls introduced with 
respect to age, sex and color undoubtedly reduce some of the bias caused by the large 
nonresponse rate, it may well be that attitudinal and personality differences between 
the responders and nonresponders represent an uncontrolled variable which is cor- 
related with health status. 

Though the authors are careful to point out that the household interview may still 
be useful for other purposes, and though their findings on days of disability are based 
exclusively on the interview data, many readers are likely to conclude from this report 
that little or no reliance can be placed on public health statistics collected through 
personal interviews. Yet for some purposes, such data may be more significant than 
the results of clinical evaluations. In planning dental services and facilities, for ex- 
ample, the fact that only 3 per cent of the public report any dental problems is prob- 
ably more meaningful than the fact that 80 per cent may have needs for care. From 
one point of view, there is logic in restricting the definition of chronic illness to those 
conditions which people recognize as bothersome or disabling in some way, and in 
ignoring conditions which are unknown to the respondent or which he does not regard 
as illness. Of the 70 per cent of diagnosed conditions which “could have been reported” 
but were not reported, we do not know how many were forgotten or deliberately not 
mentioned, and how many were not reported simply because the respondent did not 
think them worth reporting. It may be, of course, that respondents report only a frac- 
tion of even those conditions which they recognize as illness, but no evidence on the 
matter is to be found in this study. Until such evidence is forthcoming, it would be 
inadvisable to conclude that the household interview technique has no place in the 
collection of public health data. 

But there can be little quarrel with the main findings of the Baltimore study. It 
is clear that the prevalence of chronic illness among the population is much greater 
than had been supposed; that only a small proportion of such health conditions can be 
ascertained by even the most intensive household interviews; and that researchers in 
this field must devise more effective means of inducing samples of the public to co- 
operate in diagnostic examinations if the quality of present morbidity statistics is to 
be improved. The authors are well aware of the limitations of their data. Approximately 
half the book, including ten very valuable appendixes, is given over to description 
and evaluation of the methodology, and supporting tables are almost everywhere 
in evidence. The findings have been analyzed conservatively and skillfully. The 
presentation is for the most part lucid, though it suffers somewhat from an inconvenient 
organization of the materials and from an obvious multiplicity of authorship. 
Anyone engaged in public health research or with an interest in the personal interview 
as a data-collecting instrument will almost certainly learn a great deal from a close 
study of the contents. 

* Elinson, Jack, and Trussell, Ray E., “Some factors relating to degree of correspondence for diagnostic 
information as obtained by household interviews and clinical examinations,” American Journal of Public Health 
47: 3 (March 1957), p. 318. 

772 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1958 


The Blood Pressure in a Population: Blood Pressure Readings and Height and Weight 
Determinations in the Adult Population of the City of Bergen. Johs. Bøe, Sigurd Humerfelt, 
and Frøystein Wedervang. Bergen: A.S John Boktrykkeri, 1956. Also published as 
Supplement 321 to Acta Medica Scandinavica, vol. 157. Pp. 336. Paper. 


Donald Mainland, New York University College of Medicine 


THIS monograph is the outcome of an opportunity, provided by compulsory mass 
radiography in 1950 and 1951, to measure blood pressures, heights, and weights of 
a large number of the inhabitants of Bergen, Norway, who were over 14 years of age. 
The topics discussed in the seven chapters are: Previous surveys of blood pressure, 
height, and weight; Material and methods of the Bergen survey; Blood pressure in 
relation to sex and age; Height and weight patterns; Influence [sic] of height, weight 
and age on blood pressure; Effect on blood pressure patterns of excess mortality in 
higher age groups; A two-dimensional description of blood pressure (simultaneous 
consideration of level and of amplitude, i.e., pulse pressure). 

The literature review is comprehensive, informative, and critical. In discussion 
of the Bergen survey careful attention is paid to observational details, such as the 
influence of an obese arm on blood pressure readings, and the effect produced on the 
numerical analysis by the observers’ tendency to read pressures, not to the nearest 5 
mm. as prescribed, but to the nearest 10 mm. Statistical analysis, by standard meth- 
ods, was performed as part of a thesis in the field of economics by one of the authors 
(F. W.) who gives lists of assumptions underlying such techniques as regression 
analysis, but has failed to correct such phrases as “the effect of height, weight and age 
on blood pressure.” (Reviewer’s italics.) 

Among the results emphasized by the authors are: 

1. The low value of the regression coefficient of blood pressure on weight (systolic 
3 mm. and diastolic 2 mm. per 10 kg.). 

2. The frequency of overweight, especially in females, when the average for young 
adults is taken as the standard (e.g., 20 per cent of women between 45 and 64 years 
old weighed at least 20 kg. more than the standard). 

3. The indication, from available vital statistics, that the leveling off of blood pres- 
sure in the higher age groups could be explained by mortality. 

An enormous amount of careful work has gone into this study (including the pro- 
duction of 106 tables in the text and 60 pages of appendix tables), and the authors 
are probably justified in believing that it approaches more nearly a total-population 
blood pressure survey in a large community than any of its predecessors. They 
admit that such mass studies serve chiefly as backgrounds for more “limited and re- 
fined studies of the development of blood pressure,” but they claim that their 
investigation “should give a reliable picture of blood pressure patterns as they exist 
in the entire population of a city going about its normal daily life.” In trying to 
decide whether this picture, created with so much effort, has much value, and whether 
its statistical minutiae (numerous means, sigmas, and regression coefficients) may not 
be actually misleading, we note two facts: 

1. The investigation did not include health histories or clinical examination. It is 
merely stated that representative groups within the various blood pressure groups 
were called back (presumably without power of coercion) for thorough clinical 
examination, and that the outcome will be published later. 

2. Blood pressure readings were made on only 75 per cent of the total 88,339 per- 
sons, over 14 years old, eligible on January 1, 1950. Of that total, 7,868 were exempt 
from attendance for various reasons (dead, left Bergen permanently, already mass 
x-rayed in other places, students temporarily absent, already on Tuberculosis Reg- 
ister, certified medically unfit to attend). Another 10,016 persons failed to attend for 
x-raying; two reasons suggested in the report are emotional reactions connected with 
BCG vaccination, and sickness, especially in the older age groups. The selection rates 
varied greatly with age, from 47.9 per cent of the males aged 15-19 to 89.3 per cent of 
the females aged 40-44. 

The missing 25 per cent recalls forcibly a statement by Bradford Hill: “I would 
therefore myself infinitely sooner have, say, a one in four sample of the population, 
of a size thereby which enabled me to pursue relentlessly, and complete the records 
for, all or nearly all the persons in it, than have to interpret the figures derived from 
survey of the ‘whole’ population from which finally a quarter was missing.” 

Hill’s statement occurs in the report on a conference organized by the Medical 
Research Council of Great Britain for discussion of The Application of Scientific 
Methods to Industrial and Service Medicine (London: H. M. Stationery Office, 
1951), and the same report reveals experiences of health surveys conducted by the 
Council’s Pneumoconiosis Research Unit—experiences that led Richard Doll to say: 
“It is perhaps generally reasonable to allow a lapse rate of up to 5 per cent, but a 
lower one should always be aimed at and it must be realized that anything appreciably 
higher may materially bias the results.” 

Two pieces of evidence seem to satisfy the authors that the missing persons did not 
differ in blood pressure from the observed persons: 

1. As a result of the x-ray examination the cards of some persons were filed in a 
separate register for further investigation by the Tuberculosis Department; in 719 
such persons the blood pressures differed little from those of other persons of the same 


2. Small variations in blood pressure were found between occupations and social 
groups in a “limited survey” of males aged 40-45. Numerical evidence is not supplied. 

Even the statistical co-author appears to be unaware of the fallacious inferences 
that, as Joseph Berkson pointed out (Biometrics Bull., 2 (1946), 47-53; Proc. Mayo 
Clinic, 30 (1955), 319-348) can arise from a kind of competition between selection 
rates. Although generally discussed with reference to attributes, this mechanism can 
affect measurements also. Here, for example, the door to the bias would be open if, 
besides the observed age differences in selection rates, different blood pressure levels 
had, independently of age, different selection rates, which is not inconceivable, be- 
cause blood pressure and the responses to propaganda for mass radiography may be 
associated with each other by emotional, socio-economic or other links. 
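
The mechanism can be illustrated with a small arithmetic sketch. All figures below are 
hypothetical and are not taken from the Bergen report: a single age group of 1,000 
persons is assumed to contain a "lower" and a "higher" pressure stratum whose attend- 
ance rates differ, while the over-all attendance rate remains 75 per cent. 

```python
# Hypothetical figures for one age group; none come from the Bergen report.
# Each stratum is (persons, mean systolic pressure in mm., attendance rate).

def mean_pressure(strata, use_rates):
    """Mean pressure of all persons, or of attenders only if use_rates is True."""
    weights = [n * (rate if use_rates else 1.0) for n, _, rate in strata]
    total = sum(w * bp for w, (_, bp, _) in zip(weights, strata))
    return total / sum(weights)

def attendance_rate(strata):
    """Over-all proportion of the age group that attends."""
    attended = sum(n * rate for n, _, rate in strata)
    return attended / sum(n for n, _, _ in strata)

# Assumption: persons with higher pressure attend less readily.
strata = [(800, 130.0, 0.80), (200, 170.0, 0.55)]

true_mean = mean_pressure(strata, use_rates=False)
observed_mean = mean_pressure(strata, use_rates=True)
overall_rate = attendance_rate(strata)

print(f"attendance rate: {overall_rate:.2f}")
print(f"true mean:       {true_mean:.1f} mm.")
print(f"observed mean:   {observed_mean:.1f} mm.")
```

Under these assumed rates the attenders average about 135.9 mm. against a true mean of 
138 mm., although the over-all attendance rate within the age group looks unremarkable. 
Equal attendance rates in the two strata would leave the mean unchanged. 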

In view of the mystery surrounding the factors responsible for blood pressure differ- 
ences, we have no right to shrug off the risk of competing-rates bias. This mystery, 
incidentally, was met in the Bergen survey itself; blood pressures in the northern 
and southern sections of the city differed significantly. The authors could find nothing 
to explain the differences, but they did not let this cast doubt upon the statistical 
quantities derived from the sections separately—as if each section were homogeneous 
in itself. 

It does not appear likely that we shall be led nearer the solution of the mysteries of 
blood pressure by superficial information regarding a large but defective and hetero- 
geneous sample. 
Health in California, a publication of the State of California Department of Public Health, 
Malcolm H. Merrill, Director. California State Printing Office, Documents section, North 
Seventh Street and Richards Boulevard, Sacramento, California, 1957. Pp. 96. $1.50. 
Paper. 


William G. Madow, Stanford Research Institute 


The sample survey is increasingly being used as a tool in research and applications 
in various areas in which members of this Association are interested. Many of the 
reports of such surveys are incomplete in that they neither measure nor discuss the 
sources and magnitudes of errors of various kinds to which their results are subject, 
nor state fundamental reasons for omitting the consideration of these aspects of a 
survey. It is a pleasure, therefore, to review a publication whose authors have kept 
in mind the desire of their readers to understand the background of the statements 
made in the report, who have not claimed that they must “write down” to their 
readers because the readers could not otherwise understand, and who, none the less, 
have written a report that is clear and easily understood. 

The California Health Survey, a sample survey of the state of California, consisted 
of 52 statewide samples of approximately 200 households, one of the 52 samples being 
interviewed each week of the year. Sufficient call-backs were made so that information 
was obtained from over 96 per cent of the households containing members eligible for 
interview in the sample. 

There are discussions of the need for data on illness and its treatment, and of the 
uses of the findings of the study in a state health department. 

The study contains information on California’s population by age, sex, nativity 
and race, family income, marital status, activity status and migrant status. 

After the incidence of illnesses of various kinds, days of disability, and the medical 
care used, including hospitalization, are discussed for the entire state, these topics 
are also summarized for various population groups. 

There is also a chapter on methodological qualifications in which sampling and non- 
sampling errors are discussed. Measurement of the latter is based on the post-enumera- 
tion survey. 

The report concludes with appendixes containing the questionnaire, more detailed 
tables of data than appear in the body of the report, and definitions of the terms and 
measures used. 

Other reports are promised and will be eagerly awaited by all concerned with 
surveys and health. 

Let us review the structure of this survey. The reasons for making the survey are 
made explicit. A pretest was made in which not only was the survey questionnaire 
tested, but also use was made of the records of hospitals and physicians to evaluate 
the quality of information yielded by the interview. (The report on this pretest has 
not yet appeared.) A probability sample was selected and adequate callbacks made. 
(No statement has been included of a training procedure for the interviews.) Sampling 
errors were estimated. A post enumeration survey to evaluate response was made. 
(Not much information from the post enumeration survey is in this report.) The uses 
to which the results will be put are itemized. 

This report thus provides a very good model of how to make and report a descrip- 
tive sample survey. Little analytic material has been included; much remains to be 
published; and of course many quibbles could be made. But this is a really good piece 
of work. 